
COMPLEXITY MEASURES FOR PROGRAMMING LANGUAGES

Technical Memorandum 17

(This report was reproduced from an M.S. Thesis, MIT,
Dept. of Electrical Engineering, September 1971.)

Leonard I. Goodman

September 1971

PROJECT MAC
Massachusetts Institute of Technology
Cambridge, Massachusetts 02139

ACKNOWLEDGEMENTS

I wish to thank Professor John Donovan, the supervisor of
this thesis, for his encouragement, enthusiasm, and guidance
during the research and preparation of this report. His help
is deeply appreciated.

I acknowledge fellow graduate student Jerry Johnson for
the many discussions we had during the early formulation of
this work, and Cathy Doyle for her typing of this thesis report.
Finally, I thank my wife Hindi for her patience and encouragement.

Work reported herein was supported in part by Project MAC, an M.I.T. research program sponsored by the Advanced Research Projects Agency, Department of Defense, under Office of Naval Research Contract Number Nonr-4102(01). Reproduction of this
document, in whole or in part, is permitted for any purpose of
the United States government.

COMPLEXITY MEASURES FOR PROGRAMMING LANGUAGES*
Leonard I. Goodman

Abstract

A theory of complexity is developed for algorithms imple-
mented in typical programming languages. The complexity of a
program may be interpreted in many different ways; a method for
measuring a specific type of complexity is a complexity measure
-- some function of the amount of a particular resource used by
a program in processing an input. Typical resources would be
execution time, core, I/O devices, and channels.

Any resource whose use is independent of previous and future
usage can be handled by the theory. This condition includes time
but excludes space complexity. For a specific measure, the com-
plexity of the basic programming elements can be determined and
used to compute the complexity of an arbitrary program with a
particular input. Because this method gives little information
about the general complexity behavior of the program, another
approach is developed.

This new approach analyzes the complexity of a program with
respect to a valid set of inputs -- a finite set of legitimate,
halting inputs. A program equation is developed to make the trans-
formations undergone by the inputs more explicit. Using the equa-
tion, the input set is partitioned into classes of constant com-
plexity. The classes are used to compute maximum, minimum, and
expected complexities of the program on the input set.

Several equivalence relations are defined, relating different
programs by their complexity. Complexity is also discussed in terms
of concatenation and functional equivalence of programs.

*This report reproduces a thesis of the same title submitted to
the Department of Electrical Engineering, Massachusetts Institute
of Technology, in partial fulfillment of the requirements for the
degree of Master of Science, September 1971.

Chapter I - Introduction 6

1. Functions, Algorithms, Programs 7

2. Complexity Measures 9

3. Previous Work 10

4. Graph Models of Programs 11

Chapter II - Complexity of Arbitrary Programs for

Single Inputs 13

1. Introduction 13

2. Constraints on Complexity Measures 13

3. Complexity of Basic Program Elements 15

4. Complexity of Arbitrary Programs 22

5. The Set Approach 23

Chapter III - Program Equations and Set Complexity 24

1. Introduction 24

2. Input Sets 24

3. Program Equations 25

4. Complexity Equivalence Classes 31

5. Conclusion of Example Program 35

6. Summary 37

Chapter IV - Complexity of Advanced Constructs and Input

Schemes 38

1. Introduction 38

2. Subroutine and Function Calls 38

3. LOOP Blocks 44

4. Multiple Inputs 48

5. Different Data Types 51

6. String Input 52

Chapter V - Results in the Complexity Theory of Programming

Languages 53

1. Introduction 53

2. Preliminary Definitions 53

3. Relations Between Programs with Identical Input

Sets 56

4. Concatenation 67

5. Functional Equivalence 73

Chapter VI - Conclusions and Suggestions for Further Study 76

Appendix - Mathematical Notation 80

References 84


Chapter I. Introduction

Within the past ten years, there has been an increased
interest by computer scientists and mathematicians in the theory
of computational complexity. This theory is concerned with
measuring the difficulty of computing functions and with studying
the properties of measures of computational difficulty. Most
of the work done in this field has remained within the domains of
recursive function theory and the analysis of Turing machine
computation of functions. (See, for example, Hartmanis and
Hopcroft [1] for an overview of complexity theory.) One area of
complexity theory that has not received much attention is the
analysis of functions represented by computer programs. The
current research is directed towards this area.

We attempt to apply the basic principles of computational complexity theory to algorithms which are implemented in typical programming languages. One of the basic ideas of this complexity theory of programming languages is to view the complexity behavior of a program over a finite set of "valid inputs", rather than over some infinite domain or for only one input. A "valid input" is one for which the program in question halts and which the program is actually intended to process. Looking at the complexity for all the elements in a set of this type will enable us to get a better picture of the complexity behavior of the program. We will also be able to define some relationships between different programs if the complexities of their elements relate in certain ways.

1. Functions, Algorithms, Programs

We have mentioned the concept of computational complexity with regard to functions, algorithms, and programs without clearly defining these three terms and explaining the differences between them.

A function defines an association between the objects
of one set (the domain of the function) and the objects of
another set (the range). The method of determining the object
in the range set corresponding to the object in the domain need
not be explicitly stated; a set of ordered pairs of the form

(domain element, range element)
completely defines a function. Even if a rule for the function
is given (e.g., via a lambda expression [2]) the evaluation of
the rule may remain indeterminate. For example, if we have the
function

λx. x + x*x
do we perform the addition first or the multiplication?

The computational complexity of a function would
have to measure the difficulty of computing the range element
given a domain element, no matter what rule was used to determine
the range element (more than one rule may specify the same
function), or how the rule was evaluated (as long as the evalua-
tion procedure produced the correct answer). Complexity theory
of functions is outside of the current discourse.


An algorithm is either the specification of the rule
which defines a function and the method for evaluating the rule,
or simply the evaluation procedure for a given rule. An algorithm
is frequently presented in terms of a flow chart, where each
of the nodes of the chart represents a basic operation whose
meaning and evaluation are (hopefully!) unambiguous.

The complexity of an algorithm is more basic than
that of a function. In the case of an algorithm, we need only
examine the specific evaluation procedure defined by the algorithm
in order to determine the complexity behavior we wish to observe.
However, in our complexity analysis, we will eventually come down
to analyzing the basic elements of the algorithm: arithmetic
operations, assignments of values to variables, testing condi-
tions, branching, etc. The complexity of these simple operations
generally cannot be specified any further. We may choose to
express these operations in terms of the corresponding Turing
machine operations and deal with Turing complexity. Although
this may be adequate in some cases, the complexity of an algorithm
on a Turing machine does not give much insight into the complexity
of the same algorithm written in a programming language and run
on a computer.

A program is the implementation of an algorithm in a particular programming language. If the program is written in the assembly language of a particular machine, we can determine the complexity of the basic elements of the program in terms of the characteristics of that machine. If the program is written in a high-level language, the complexity properties of the basic operations would not be completely constrained until we specify which machine the program will run on (and probably how the language translator to be used on the program would work). However, we may choose to leave the complexity in terms of the basic operations of the high-level language.

2. Complexity Measures

The computational complexity of a program may be interpreted in many different ways. A scheme for measuring a specific type of complexity will be called a complexity measure. Basically, a complexity measure is some function of the amount of a particular resource used by a program as it processes a specific input value. This resource might be time, space, CPU usage, channel
activity, etc. We might have a program φ_i with input n. Associated with φ_i is a measuring program Φ_i which is "monitoring" the execution of φ_i. Φ_i(n) would tell us the amount of a particular resource used by φ_i to compute φ_i(n). Thus the program Φ_i is measuring the complexity of φ_i. In the recursive function formulation of complexity theory, a complexity measure is a recursive enumeration of all partial recursive functions φ_0, φ_1, φ_2, ..., to each of which is associated a step-counting function Φ_i. Φ_i is constrained to satisfy the following two axioms (from Blum [3]):

1. φ_i(n) is defined iff Φ_i(n) is defined

2. M(i,n,m) = { 0 if Φ_i(n) ≠ m
             { 1 if Φ_i(n) = m    is a recursive function

By defined, we mean that a function halts for a particular input.

Uses of Complexity Measures

As we have stated, a complexity measure provides information on the resource usage of a program. As long as programmers have been writing computer programs, they have been concerned with the resource usage of their programs; specifically, they have wanted to know how long their programs would run and how much core they would require. As multiprocessed and time-shared computer systems evolved, programmers wanted to know about the use of system resources other than CPU time and core: channel usage, device usage, secondary storage requirements, supervisor usage, etc. The amount of each of these resources used by a program would constitute a different measure of its complexity.

A theory of computational complexity would give us a method for quantitatively analyzing the complexity behavior of computer programs and for comparing different programs on the basis of this behavior. Hopefully, this theory should be somewhat independent of which type of complexity is being measured, so that the same techniques would be suitable for a number of different resources.

3. Previous Work

There has been little work done in the area of complexity measures for programming languages. Meyer and Ritchie [4] found some weak bounds on the running time of a class of programs called Loop Programs. These programs compute exactly the primitive recursive functions. However, programs written in most languages will compute recursive functions which are not primitive recursive. Thus, the Loop Program analysis is not general enough.

Ramamoorthy [5] studied the time complexity of programs which could be modelled by discrete Markov processes: each decision-making element in the program is statistically independent of all others. He felt that his analysis would be useful for micro-programmed instruction sequences. However, the techniques developed would not work in the case of an arbitrary program.

4. Graph Model of Programs

We will be using the graph model to represent the structure of computer programs. We present here an informal definition of this model.

A graph of a program consists of a set of nodes connected by a set of directed arcs. The nodes represent the statements or elements of the program; they will be labelled with a statement identifier or function name. The arcs represent the flow of control in the program. They determine the execution sequence of the nodes. More than one arc may leave a particular node. In this case, each arc will be labelled with a unique selector that determines which node will be executed next.

A successor of a given node is a node which is pointed to by a directed arc from the given node. A predecessor of a given node is a node which points to the given node. A node may precede or succeed itself. A node with no arcs leaving it is a terminal node; it is the last node to be executed. A program graph may have more than one terminal node. The node which is pointed to by an arc that has no node at its other end is the starting node. A graph may have only one such node. The arc leading into the starting node may have an input value or set of values at its other end. The entire graph will usually be labelled with the name of the program.

Thus, the program A consisting of the statements s_1, s_2, s_3, where s_1 is a conditional statement with two possible successors, would be represented as:

A: [program graph: figure not reproduced]

x is the input value; s_1 is the starting node, s_3 is the terminal node. The successors of s_1 are s_2 and s_3; the predecessor of s_2 is s_1. T and F are selectors for s_1.

Chapter II. Complexity of Arbitrary Programs for Single Inputs

1. Introduction

Before we analyze the complexity behavior of programs over a set of inputs, we will first present methods for determining the complexity of a program with respect to one input. This will involve defining the complexity of the basic programming constructs to be used in the programs, and specifying a set of rules which will enable us to compute the complexity of a group of basic constructs which have been combined. These rules will determine the types of complexity measures for which our methods and techniques will be valid. It happens that these same rules will be sufficient for analyzing complexity behavior over a set of inputs.

2. Constraints on Complexity Measures

We will require that our measures of complexity satisfy three rules. The first two rules are the axioms of Blum presented in Chapter I. The first axiom, φ_i(n) defined iff Φ_i(n) defined, implies that the measuring function (program) Φ_i must depend on the entire computation of φ_i on input n; if this were not true, then Φ_i might produce an answer even though φ_i did not halt. This axiom also implies that when φ_i terminates, we have all the necessary information to determine the complexity. The second axiom,

M(i,n,m) = { 0 if Φ_i(n) ≠ m
           { 1 if Φ_i(n) = m    is a recursive function

states that we can always tell if φ_i operating on input n will have complexity m. In the case of time complexity we could let our program φ_i with input n run until it had executed for m time units. If φ_i halted, M(i,n,m) = 1. If φ_i continued to execute, M(i,n,m) = 0.
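In the time-measure case this argument can be sketched directly. Below is a minimal toy encoding of our own (not from the thesis): a "program" is a Python generator that yields once per executed step, and M runs it for at most m steps.

```python
def M(program, n, m):
    """Blum's M(i,n,m) for a step-counting (time) measure: run the
    program on input n for at most m steps; return 1 iff it halts
    in exactly m steps, else 0."""
    gen = program(n)
    steps = 0
    try:
        while steps <= m:
            next(gen)        # execute one step
            steps += 1
    except StopIteration:    # the program halted
        return 1 if steps == m else 0
    return 0                 # still running after m steps, so complexity != m

# A toy program: computes 2n, taking n steps.
def double(n):
    acc = 0
    for _ in range(n):
        acc += 2
        yield                # one "step"
    return acc
```

Note that M answers even for a non-halting program: after m steps we already know its complexity is not m, which is exactly why the axiom can demand that M be recursive while the program itself is only partial.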

The third rule is the linearity constraint. It is this rule which allows us to find the complexity of a group of basic elements which are formed into a structured program. The constraint may be stated as follows:

Let A be a program with input x, such that A can be divided into two segments s_1 and s_2, where s_1 is the predecessor of s_2. A graph for A would be:

A: [program graph: x → s_1 → s_2; figure not reproduced]

We can represent the complexity of A with input x as C(A,x). Similarly, the complexity of s_1 is C(s_1,x). However, s_2 does not have input x but rather some transformation upon x, induced by s_1. We can represent the input to s_2 as s_1(x). The complexity of s_2 is then C(s_2,s_1(x)). Linearity requires that for all x for which A halts,

C(A,x) = C(s_1,x) + C(s_2,s_1(x))

Another way to state the linearity constraint is that any use of the resource in question must be independent of how much was used previously or how much will be used in the future, but dependent upon transformations of the input.


Time of execution satisfies this constraint. Other resources which also do are the number of calls to the supervisor program, device usage, and channel usage (from the point of view of the number of times the channel is used). One resource that does not generally satisfy the constraint is the amount of core used by a program. Since core may be shared, it may be available for different uses at different times. In the previous example, some of the space needed for s_2's computations may be done in s_1's area. Thus,

C(A,x) ≠ C(s_1,x) + C(s_2,s_1(x))

However, if space is never reused, then space complexity may be incorporated into the general theory. For any resource which obeys this constraint, the methods to be presented may be used to determine the complexity of a program with regard to the use of that resource.
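The linearity rule lends itself to a direct sketch. The (transform, cost) encoding of a segment below is our own device, not the thesis's notation: sequencing segments adds their costs, each evaluated on the successively transformed input.

```python
def complexity(segments, x):
    """C(A,x) for a chain of segments by the linearity rule:
    C(A,x) = C(s1,x) + C(s2,s1(x)) + ..."""
    total = 0
    for f, c in segments:
        total += c(x)   # resource used by this segment on its input
        x = f(x)        # the next segment sees the transformed input
    return total

# s1 doubles x at a fixed cost of 2; s2 adds 1 at a cost equal to its input.
s1 = (lambda x: 2 * x, lambda x: 2)
s2 = (lambda x: x + 1, lambda x: x)
# C(A,3) = C(s1,3) + C(s2,s1(3)) = 2 + 6 = 8
```

Reversing the segment order changes the total (s2 then s1 on input 3 costs 3 + 2 = 5), which is the point: costs depend on the transformed inputs, not just on the segments.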

3. Complexity of Basic Program Elements

Our goal is to be able to determine the complexity of an arbitrarily structured program. First, we will define the complexity of the basic elements of our language. This language will not be any specific one but will be representative of modern high-level languages. Below, we list one possible set of basic elements. Naturally, we cannot include every possible program construct; however, those listed are found in many languages.

Arithmetic operations

Assignment

Transfer of control

Conditionals

Iteration

Function calls

I/O and supervisor calls

In discussing each of these elements, we will use time as a sample resource. Of course, "time" may be expressed in microseconds, CPU cycles, or any other units. Other complexity measures may be handled similarly.

Arithmetic Operations

This category includes the common mathematical opera-
tions. The time complexity of any of these operations is just the
time required to execute it. If we are dealing with an assembly or
machine language program, we may express the time in terms of the
instruction execution time of the corresponding machine. If we
are writing in a high-level language, and if we know the computer
which will execute the machine code resulting from our program,
we may express the time complexity in a similar manner (ignoring
any compiler optimization).

If we do not know which machine will execute our program,
we may choose to measure the complexity in terms of the number of
additions, the number of multiplications, and the number of other
independent operations. We may think of having an n-dimensional "complexity vector", where n is the number of independent operations. The "unit vector" for each of the dimensions is the complexity of a basic operation; the coefficient of this unit vector is the number of such operations which have occurred. We can reduce this vector only if we express the unit complexities in terms of something else, such as the machine instruction times of a specific computer. Then we can obtain one value for the complexity, as we did in the case of an assembly language program.
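The complexity vector can be sketched as an operation-count mapping that is reduced to a single number only once per-operation timings for a specific machine are supplied. All counts and timings below are invented for illustration.

```python
from collections import Counter

# Complexity vector: coefficient per independent operation.
ops = Counter({"add": 5, "mul": 2, "branch": 3})

# Hypothetical per-operation times for one specific machine.
machine_times = {"add": 1.0, "mul": 4.0, "branch": 1.5}

# Reducing the vector: dot it with the machine's instruction times.
total_time = sum(count * machine_times[op] for op, count in ops.items())
```

Without the `machine_times` table the vector itself is the answer; the reduction step is exactly the binding of the program to a particular computer described above.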

Note that we are treating the complexity of these opera-
tions as having a fixed value, independent of the value of the
operands. If this assumption were not true, we would have to examine
the sub-operations which form an operation until we found some con-
structs which were complexity-invariant. We will need this condition when we examine complexity for a set of inputs.

Assignment; Transfer of Control

By assignment, we mean the assigning of the value of one
variable to another. Transfer of control is the familiar BRANCH
or GOTO statement. These two constructs may be treated in the same
way as the arithmetic operations: either their complexity may
be expressed in terms of instruction execution time or they may
be treated as two of the independent operations. We also assume
that the complexity of these operations is a fixed value.

Conditionals

The prototype of our conditional statement will be:

IF p(x) THEN s_1 ELSE s_2

p is some test on the value of x which does not change this value. If this test is satisfied (p(x) TRUE), then s_1 will be executed; otherwise s_2 will be executed. We can represent the structure of the conditional by the following graph model:

[graph of the conditional: figure not reproduced]

Using the linearity condition and noting that p does not change the value of x, the complexity of our conditional will be:

C(cond,x) = C(p,x) + { C(s_1,x) if p(x) TRUE
                     { C(s_2,x) if p(x) FALSE

p, s_1, and s_2 may all represent complex constructions, which can be broken down to basic elements.
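Continuing with a (transform, cost) encoding of our own (not the thesis's notation): since p leaves x unchanged, the conditional's cost is p's cost plus the cost of whichever branch p(x) selects, both evaluated on the same x.

```python
def cond_complexity(p, c_p, branches, x):
    """C(cond,x) = C(p,x) + C(s1,x) if p(x) is true, else C(p,x) + C(s2,x)."""
    f, c = branches[p(x)]   # pick the (transform, cost) pair for the branch
    return c_p(x) + c(x)

# Example: p tests evenness at cost 1; the TRUE branch costs 2, FALSE costs 5.
branches = {True:  (lambda x: x // 2,    lambda x: 2),
            False: (lambda x: 3 * x + 1, lambda x: 5)}
```

So an even input costs 1 + 2 and an odd input 1 + 5, matching the two constant complexities discussed later under "Types of Complexity".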

Iteration

In most languages, the programmer has the ability to execute a section of program repeatedly, depending upon certain conditions. This is the basis of the iteration statement. We usually have a variable, defined only for the iteration construction, whose value is incremented from a lower limit to some upper limit while the statements within the bounds of the iteration statement (the body) are executed for each increment of the variable. Some languages allow additional features: multiple ranges for the control variable, negative increments, negative lower limits, attaching a conditional test to the iteration, and others. Examples of the iteration construction are the FORTRAN DO statement, the ALGOL FOR statement, and the PL/I DO statement.

We will use a particularly simple form of iteration state-
ment. We will retain only the concept of executing a body of state-
ments for a certain number of times. This is the LOOP block and
has the following form:

LOOP N

{body}

END

The semantic interpretation of the LOOP block is that the body is executed N (contents of N) times in succession. Changes to N within the body do not affect the number of times that the body is executed. The body may contain other LOOP blocks; thus they may be nested to any level.

The complexity of a LOOP block may be interpreted in several ways. We may view the LOOP structure as equivalent to N physical copies of the body of the block. We may then compute the complexity using linearity. Alternatively, since the LOOP block is usually a feature of a high-level language, we may examine its translation in machine language. This will involve the initialization of a dummy variable, the body of the block, the incrementing of the variable, a test to see if we have exceeded the number of iterations, and a transfer to the beginning of the body. The complexity of the block could then be computed using the complexity of these elements and the linearity condition.

We will use the simpler interpretation of complexity. If we denote the body of the LOOP block by s, and assume that all statements within s are a function of the variable x, then the LOOP block

LOOP N
  s
END

will have complexity

C(LOOP-BLOCK,x) = C(s,x) + C(s,s(x)) + C(s,s^2(x)) + ... + C(s,s^(N-1)(x))

where s^i denotes the functional composition of s with itself i times.
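The LOOP-block formula can be sketched with the same (transform, cost) encoding of a body s used in our earlier sketches (ours, not the thesis's): each of the N iterations is charged at the input produced by the previous one, i.e. C(s, s^i(x)).

```python
def loop_complexity(s, N, x):
    """C = C(s,x) + C(s,s(x)) + ... + C(s, s^(N-1)(x))."""
    f, c = s
    total = 0
    for _ in range(N):
        total += c(x)   # cost of this iteration on its current input
        x = f(x)        # the next iteration sees s applied once more
    return total

# Body: x -> x + 1, at a cost equal to its current input.
body = (lambda x: x + 1, lambda x: x)
# N = 3, x = 2: C = 2 + 3 + 4 = 9
```

This is the "N physical copies" reading; the machine-language reading would add a constant overhead per iteration for the dummy variable, test, and transfer.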

Function Calls

A function or subroutine call involves a call to the subprogram executed in the calling program, and the execution of the subprogram itself. We will assume that the call statement is a basic construct similar to an arithmetic operation and that its complexity is independent of its argument; furthermore, we will assume that it does not change the value of its argument. The value of the call statement argument, and hence the input to the subprogram, will be some transformation of the original input to the calling program. As an example, consider the program A with input x which contains a call to the function B with argument x (actually some transformation on the original value of x). A could be represented in graphical form as:

[graph of A with the call to B: figure not reproduced]

The complexity of A would then be

C(A,x) = C(s_1,x) + C(CALL) + C(B,s_1(x)) + C(s_2,B(s_1(x)))

All occurrences of x in this expression denote the original input value of x.

I/O and Supervisor Calls

We can view these constructs in the same light as arithmetic operations. For a particular machine, the time complexity of these elements is simply the time needed to execute them. In a high-level language, we may treat them as independent operations.

Types of Complexity

Certain elements have a constant complexity which does not depend on their inputs. This group includes arithmetic operations, assignment, transfer, and the supervisor operations. Conditional statements have a constant complexity for those inputs for which p(x) is true, and another constant complexity for those inputs for which p(x) is false. This assumes that the elements composing p, s_1, and s_2 are all of constant complexity. Finally, we have LOOP blocks and subroutine calls, which have a complexity dependent upon the input and number of iterations for the LOOP block and the input for the subroutine or function.

If certain supervisor operations have different complexities for different inputs, we may treat them as subroutines.

4. Complexity of Arbitrary Programs

Having defined the complexity of our basic programming
constructs, we can use these complexities and the linearity condi-
tion to find the complexity C(A,x) of any program A with any
input X for which A halts. (We may easily extend our results to
include the case of programs with multiple inputs.) This approach,
however, does have shortcomings.

The complexity of a program with respect to one input
will not usually give us much information about the general com-
plexity behavior of the program. We will know even less if the
program does not halt for that input. If we wish to learn more


about the complexity behavior, we will need to repeat our calcu-
lations for a large set of inputs. Although the program may
exhibit the same complexity behavior for many different inputs,
we will not be able to take advantage of this relationship because
we are dealing with each input individually. The domain of input
values is arbitrarily large until we bind our program to a partic-
ular machine or language specification. Because of the nature of
this domain, we cannot bound the values of C(A,x) since A defines
an arbitrary partial recursive function. Finally, we have no
facility for comparing the complexity behavior of two different
programs other than inputting the same value to both programs and
calculating the resultant complexity.

5. The Set Approach

The remaining chapters will use the work on single input
complexity to develop another approach to program complexity. This
approach examines the complexity behavior of a program over a
finite set of inputs which have certain useful properties. The
set approach will make it easy to examine such behavior as the
expected value of complexity and maximum and minimum complexity;
the complexity structure of a program will become more apparent.
Also, the set approach will enable us to examine complexity rela-
tions between different programs.


Chapter III. Program Equations and Set Complexity

1. Introduction

Having defined the complexity of the basic program elements and developed methods for determining the complexity of a program with respect to one input, we are ready to deal with program complexity with respect to a set of inputs. We develop the concepts of the program equation and a valid set of inputs to aid in the complexity analysis. We make some simplifying assumptions about the type and number of inputs to our programs and the type of components in these programs. In the next chapter we remove these restrictions to obtain a more general model. An explanation of the mathematical notation used in this and later chapters will be found in the appendix.

2. Input Sets

We will assume a simplified form of input structure for our programs. These programs will have only one input, which will be a non-negative integer; further, we will assume that all operations upon this input result in non-negative integer values. Given a program A which satisfies these conditions, let U_1 be the set of non-negative integer inputs for which A halts, and which are valid inputs to A. By a valid input, we mean an input which A is actually intended to process. Thus, if A is meant to process only even non-negative integers, then U_1 contains only these integers, though A may halt for some odd integers or, in fact, for all odd integers.


Next, let U_2 be the set of allowable non-negative integer values for the machine on which A is running or for the language in which A is written. (If both the language and the machine limit the set of values, we will use those conditions which are more restrictive.) U_2 may be quite large, but it is always finite. We can now define a valid input set X to the program A as:

X = U_1 ∩ U_2

We see that X is finite and for all inputs x in X, A halts.

We may be interested in examining only some of the valid inputs to A at any particular time. If X′ ⊂ X, we will say that X′ is also a valid input set to A. X′ has the same properties as X except that it does not contain all the valid halting inputs for A. We will refer to X as the maximal valid input set to program A for a particular U_2 - i.e., for a particular machine or language realization.
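As a toy illustration of X = U_1 ∩ U_2 (the particular sets are invented here): suppose A is meant to process even inputs, and the machine represents integers only up to 63.

```python
U1 = {x for x in range(1000) if x % 2 == 0}   # valid inputs on which A halts
U2 = set(range(64))                           # machine-representable values

X = U1 & U2   # the maximal valid input set: finite, and A halts on all of it
```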

3. Program Equations

We now define a method for obtaining an equation representation for a program. We will initially assume that the program contains no LOOP blocks or subroutine calls, and has only one input. These restrictions will be removed later.

The concept of a program equation is based on the work
of Zeiger [6]. He derives some relationships between programs,
polynomials, and power series. We start the derivation of


the equation by putting our program A into graphical represen-
tation. With each node of the graph, we can associate a func-
tion on the inputs to that node. Since we have temporarily
eliminated LOOP's and subroutines, we have remaining arithmetic
operators, assignment, transfer, conditionals, and supervisor
calls. Transfer does not change any values. Assignment is a
type of arithmetic operation, and supervisor calls change values
of variables in specified ways. Thus, we have two general types
of functions: arithmetic and conditional.

An arithmetic function f maps a set X into another set F, defined as:

F = f(X) = { f(x) | x ∈ X }

A conditional function p maps X into a set P. We will say that p(x) = x if x satisfies the predicate p; otherwise p(x) is undefined. Therefore, p acts as an identity function for those inputs which satisfy its associated conditional test. We can see that p(x) is defined if and only if p(x) = x. Then

P = p(X) = { x ∈ X | p(x) is defined }

We are using the conditional function in a different sense than the conditional statement of the preceding chapter.

IF p(x) THEN s_1 ELSE s_2

p(x) is assumed to take either the value TRUE or FALSE. If p(x) = TRUE, we execute s_1; otherwise, we execute s_2. In the case of a conditional function, we have the following situation:

[graph: node p with one arc labelled "p(x) defined" and the other labelled "p̄(x) defined"; figure not reproduced]

We define the function p̄(x) as follows:

p̄(x) defined iff p(x) undefined

If p(x) is defined, its value is x; we take the arc with the appropriate selector and execute s_1 with input x. We cannot execute s_2 because the selector leading to this node was not satisfied.

Conversely, if p(x) is undefined, p̄(x) is defined; its value is x. We then execute s_2 with input x. One case or the other (but not both) must happen. We also have the relation

p̄̄(x) = p(x)

so that if our conditional function is p̄(x), our selectors will remain the same. We will abbreviate "p(x) defined" by "T" (TRUE) and "p̄(x) defined" by "F" (FALSE). This is in keeping with the standard notation of program conditionals.
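A conditional function and its complement can be sketched as partial identities, modelling "undefined" by None (an encoding of ours, not the thesis's):

```python
def cond_fn(pred):
    """p(x) = x when pred(x) holds; otherwise undefined (None)."""
    return lambda x: x if pred(x) else None

def complement(p):
    """p-bar: defined exactly where p is undefined, again with value x."""
    return lambda x: x if p(x) is None else None

p = cond_fn(lambda x: x % 2 == 0)   # defined on even inputs
p_bar = complement(p)
# p(4) == 4 and p_bar(4) is None; p(3) is None and p_bar(3) == 3.
```

Complementing twice recovers p's behaviour, matching the relation above.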

Having specified the function associated with each
node of the program graph, we define the equation of the pro-
gram as the summation of all possible functional paths from
the starting node of the graph to any terminal node. A func-
tional path is defined as the composition of the functions
associated with the nodes which comprise the path. The resulting
composed function is applied to the set of inputs. The sum
of these functional applications is set equal to the function
defined by the entire program applied to the set of inputs.

Given a program A with input set X, its equation would
be

A(X) = Σ_{i=0}^∞ f_i(X)

where each f_i is a composition of simpler functions. The
summation is of infinite extent because an arbitrary program
graph contains an infinite number of paths.

We illustrate these concepts with a sample program:

A: [Figure: a conditional node p with a T arc to an arithmetic
node g, an arc from g back to p, and an F arc to a halting node.]

The simplest path through A starts at p and takes the F arc
to the halting node. We will assume that this latter node has
no functional or complexity significance. If we recall that
F is equivalent to "p̄(x) defined", the first term of A(X) will
be

f_0(X) = p̄(X) = { x ∈ X | p̄(x) is defined }

The next simplest path starts at p, takes the T
arc to g, returns to p, and then takes the F arc to the halting
node. The composition of the functions encountered in this
path is (with the right-most function encountered first)

f_1 = p̄·g·p

This function applied to X yields the next term of A(X):

f_1(X) = { g(x) | x ∈ X & p(x) defined & p̄(g(x)) defined }

Continuing in this manner we get the complete equation:

A(X) = p̄(X) + p̄·g·p(X) + p̄·(g·p)²(X) + ...

     = Σ_{i=0}^∞ p̄·(g·p)^i(X)

where (g·p)⁰ = the identity function. There is no unique corre-
spondence between a particular index and a particular functional
path. In this case, we have calculated functions in order of
increasing path length. It happened that

f_i = p̄·(g·p)^i

However, any other one-to-one correspondence between the f_i
and the functional paths would work. We will conclude this
example later in the chapter.

The application of f_i to X results in a set of values.
We define

F_i = f_i(X)

The +'s in our equation are to be interpreted as set union. Then

F_i ⊆ A(X) and ∪_{i=0}^∞ F_i = A(X)

Each f_i is generally composed of arithmetic and
conditional functions. Suppose f_i = g_k·...·g_1. We will say
that f_i(x) is defined if and only if every conditional
function g_j ∈ { g_1, g_2, ..., g_k }, applied to its argument,
is defined; i.e., g_j(g_{j-1}·...·g_1(x)) is defined. Then
f_i(x) ∈ F_i if and only if f_i(x) is defined. Now we can define
the set of elements in X which exactly yield each F_i:

X_i = { x ∈ X | f_i(x) is defined }

We then have the relation F_i = f_i(X_i).

Simplification of the Equation

We can reduce our equation if we require that X be
a valid input set to A. We then have that X is finite and for
all inputs in X, A halts. Since X is finite, there are only
a finite number of non-empty X_i. Let m be the greatest integer
for which X_m ≠ ∅ (the empty set); then for all i > m, X_i = ∅
and F_i = ∅. We then have:

1. A(X) = Σ_{i=0}^m f_i(X)

2. ∪_{i=0}^m X_i = X (it may be that some of the X_i are empty;
   this will occur if the corresponding F_i is empty)

3. X_i ∩ X_j = ∅ if i ≠ j (because programs are deterministic,
   any input undergoes only one transformation)

4. Each f_i is composed of only a finite number of sub-
   functions since A halts for all x in X.

Thus, given a valid input set, we can reduce our program equa-
tion to a finite number of terms, each term of finite extent.
We also get a partition { X_0, X_1, ..., X_m } of our original
input set.

4. Complexity Equivalence Classes

The subsets of X, { X_i }, have the following property:
for all x ∈ X_i, the functional path associated with x is the same.
Since we have allowed no LOOP blocks or subroutines in our
programs, and since all x in X_i take the same branch from every
conditional test, the complexity of this functional path is con-
stant. This complexity is completely defined in terms of the
functions in the path and has a value C(A,x), x ∈ X_i. X_i
will be called a complexity equivalence class. Associated
with each X_i will be an equivalence class complexity, C(A,X_i).
By definition,

C(A,X_i) = C(A,x), x ∈ X_i

The complexity information for program A with input
set X is contained in the two sets

{ X_0, X_1, ..., X_m }

{ C(A,X_0), C(A,X_1), ..., C(A,X_m) }

We observe that many programs do not treat every input differently.
Often, many inputs will be similarly processed. Therefore, we
are led to believe that the number of equivalence classes will
often be smaller (though not always much smaller) than the
number of distinct input values. In this case, we have a (rela-
tively) compact description of the complexity behavior of A.
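These two sets can be computed mechanically. The sketch below (Python; the concrete p, g, and input set are illustrative assumptions, not part of the thesis) traces the functional path of each input through the sample program of this chapter and groups together inputs whose paths coincide.

```python
# Illustrative sketch: build complexity equivalence classes by tracing
# each input's functional path through the sample program A of this
# chapter ("while p(x) is defined, apply g"). The concrete p and g
# below are assumptions for illustration.

def trace(x, p, g):
    """Return the functional path taken by input x as a tuple of labels."""
    path = []
    while p(x):           # conditional function p: defined iff p(x) holds
        path.append('p')  # T arc: p acts as identity on x
        x = g(x)          # arithmetic function g
        path.append('g')
    path.append('p-bar')  # F arc: the complement of p is defined
    return tuple(path)

def equivalence_classes(X, p, g):
    """Partition X: inputs with the same functional path share a class."""
    classes = {}
    for x in X:
        classes.setdefault(trace(x, p, g), set()).add(x)
    return classes

# Assumed example: p(x) defined iff x <= 5, g doubles x.
X = set(range(1, 11))
classes = equivalence_classes(X, lambda x: x <= 5, lambda x: 2 * x)
```

With these assumed p and g, the ten inputs fall into only four classes, one per distinct path length.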

Uses of Complexity Equivalence Classes

The complexity equivalence classes may be used to
calculate the following quantities with respect to a particular
input set:

1. Expected value of complexity
2. Maximum value of complexity
3. Minimum value of complexity

In a later chapter we will show how these classes are useful
in determining relations between different programs.

Expected Complexity Value

Expected complexity value will tell us the average
resource usage by a program over a particular input set. The
expected value of a function g over a discrete-valued domain
X is (see, for example, Drake [7]) the sum of the products of
values of the function at points in the domain and the prob-
ability that the particular point will be chosen:

E(g,X) = Σ_{x∈X} g(x)·Pr(x)

If C̄(A,X) is the expected value of the complexity of A with
input set X,

C̄(A,X) = Σ_{x∈X} C(A,x)·Pr(x)

Since ∪_{i=0}^m X_i = X,

C̄(A,X) = Σ_{i=0}^m Σ_{x∈X_i} C(A,x)·Pr(x)

But for all x ∈ X_i, C(A,x) = C(A,X_i); and since the events
x ∈ X_i for distinct x are mutually exclusive,

Pr(X_i) = Σ_{x∈X_i} Pr(x)

Then,

Σ_{x∈X_i} C(A,x)·Pr(x) = Σ_{x∈X_i} C(A,X_i)·Pr(x)
                       = C(A,X_i)·(Σ_{x∈X_i} Pr(x)) = C(A,X_i)·Pr(X_i)

∴ C̄(A,X) = Σ_{i=0}^m C(A,X_i)·Pr(X_i)

The probabilities Pr(X_i) are based on any method
used in selecting the inputs to A, and take into account any
a priori knowledge of the selection method. If the selection
of inputs is random,

Pr(X_i) = |X_i| / |X|

If X_i = ∅, then for any selection method, Pr(∅) = 0. The value
of C̄(A,X) is not changed by averaging over empty sets.
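The derivation above can be checked numerically. In the sketch below (the partition and the complexity values are illustrative assumptions), averaging per-input complexities over X under uniform selection agrees with summing class complexities weighted by Pr(X_i) = |X_i|/|X|.

```python
from fractions import Fraction

# Illustrative check: with uniform selection, averaging per-input
# complexities over X equals summing class complexities weighted by
# Pr(X_i) = |X_i| / |X|. The partition and costs are assumed.

partition = {                       # assumed X_i -> C(A, X_i)
    frozenset({6, 7, 8, 9, 10}): 1,
    frozenset({3, 4, 5}): 3,
    frozenset({2}): 5,
    frozenset({1}): 7,
}
X = set().union(*partition)
n = len(X)

# Per-input form: sum over x in X of C(A,x) * Pr(x)
per_input = sum(Fraction(c, n) for cls, c in partition.items() for _ in cls)
# Class form: sum over i of C(A,X_i) * Pr(X_i)
per_class = sum(Fraction(len(cls), n) * c for cls, c in partition.items())
```

Both forms evaluate to the same exact rational, as the derivation requires.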

Maximum Value of Complexity

The maximum complexity is an upper limit on the usage
of a particular resource for any execution of a program, given
an input value from the input set in question. To compute
this quantity, we need only find the maximum value of C(A,X_i)
over the X_i. We must remember that some X_i may be empty; so
we only look at C(A,X_i) for non-empty X_i. The maximum com-
plexity will be denoted C_max(A,X).

Minimum Value of Complexity

The minimum complexity is the corresponding lower limit of
resource usage by the program. It is the minimum of C(A,X_i)
over the non-empty X_i. It will be denoted C_min(A,X).

C_max(A,X) and C_min(A,X) are important quantities. They
say that any execution of the program will require at least
C_min(A,X) of resources, but never more than C_max(A,X), when
inputs are taken from X.

5. Conclusion of Example Program

Returning to our sample program introduced earlier in this
chapter, we will specify the input set and the functions com-
prising the program:

X = { 1,2,3,...,10 }

g(x): x := 2x (x is assigned the value 2x)

p(x) = x iff x ≤ 5
p̄(x) = x iff x > 5

A(X) = Σ_{i=0}^∞ f_i(X)

f_0 = p̄    f_1 = p̄·(g·p)    f_2 = p̄·(g·p)²    ...

F_0 = f_0(X) = { x ∈ X | p̄(x) is defined } = { 6,7,8,9,10 }
X_0 = { 6,7,8,9,10 }

F_1 = f_1(X) = p̄({ 2,4,6,8,10 }) = { 6,8,10 }
X_1 = { 3,4,5 }

F_2 = f_2(X) = p̄({ 4,8 }) = { 8 }
X_2 = { 2 }

F_3 = f_3(X) = p̄({ 8 }) = { 8 }
X_3 = { 1 }

∴ (∀n ≥ 4) [F_n = X_n = ∅]

∴ A(X) = Σ_{i=0}^3 f_i(X)

Note that the complexity of the conditional test (p itself)
is the same whether p(x) or p̄(x) is defined, and the complexity
of the arithmetic function g is a constant. Therefore,

C(p,x) = C(p̄,x) = c_p, ∀x ∈ X
C(g,x) = c_g, ∀x ∈ X

C(A,X_i) = C(A,x), x ∈ X_i

C(A,X_0) = C(p̄,x) = c_p

C(A,X_1) = C(p,x) + C(g,x) + C(p̄,g(x)) = c_p + c_g + c_p = 2c_p + c_g

C(A,X_2) = 3c_p + 2c_g
C(A,X_3) = 4c_p + 3c_g

To determine C̄(A,X), the expected complexity, we need to know
the values of Pr(X_i). Let us assume a random distribution.
Then

Pr(X_i) = |X_i| / |X|

C̄(A,X) = Σ_{i=0}^3 C(A,X_i)·Pr(X_i)

= (5/10)c_p + (3/10)(2c_p + c_g) + (1/10)(3c_p + 2c_g) + (1/10)(4c_p + 3c_g)

= (9/5)c_p + (4/5)c_g

C_max(A,X) = C(A,X_3) = 4c_p + 3c_g
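These figures can be verified mechanically. The sketch below (an illustration, not part of the original) recomputes the classes for this example and keeps each complexity symbolically as a coefficient pair (a, b) standing for a·c_p + b·c_g.

```python
from fractions import Fraction

# Illustrative check of the worked example: X = {1,...,10}, g(x) = 2x,
# p(x) defined iff x <= 5. A complexity is held symbolically as a pair
# (a, b) meaning a*c_p + b*c_g.

def iterations(x):
    """Number of times the T arc (p then g) is taken for input x."""
    k = 0
    while x <= 5:
        x, k = 2 * x, k + 1
    return k

classes = {}                    # k iterations -> class X_k
for x in range(1, 11):
    classes.setdefault(iterations(x), set()).add(x)

# A path with k iterations costs (k+1)*c_p + k*c_g.
complexity = {k: (k + 1, k) for k in classes}

# Expected complexity under uniform selection, per coefficient.
expected = tuple(sum(Fraction(len(cls), 10) * complexity[k][j]
                     for k, cls in classes.items()) for j in (0, 1))
c_max = complexity[max(classes)]
```

The computed partition, expected complexity (9/5)c_p + (4/5)c_g, and maximum 4c_p + 3c_g all agree with the hand calculation above.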

6. Summary

We have developed several important concepts for the
complexity study of programs - valid input sets, an equation
representation of a program, complexity equivalence classes,
and equivalence class complexities. We have seen the use of
these concepts in the analysis of complexity behavior. Finally,
we have shown how the equivalence classes and their associated
complexities give a compact and orderly representation of
the complexity information for a program over a set of inputs.


Chapter IV. Complexity of Advanced Constructs and Input Schemes

1. Introduction

In this chapter we extend our complexity analysis pro-
cedures to include programs with additional programming features,
such as subroutine calls and LOOP blocks, and also programs with
more than one input. We also discuss input data types other than
the non-negative integers and, finally, variable-length inputs.

2. Subroutine and Function Calls

Let us assume that we have a program A with valid input
set X. A has the finite equation representation

A(X) = Σ_{i=0}^m f_i(X)

We can express the j-th term, f_j, as a composition of subfunctions:

f_j = g_n·...·g_k·...·g_1

Suppose g_k is a call to the program B. Let Y be the maximal valid
input set to B. The equation for B is then

B(Y) = Σ_{i=0}^p h_i(Y)

B will be called from A at point g_k within f_j with a set of values

X' = g_{k-1}·...·g_1(X)

where the series of composed functions is the transformation
applied to the elements of X. If we assume that A is working
correctly, then all values passed from A will be valid inputs
to B. Thus X' ⊆ Y. The equation for B(X') will have, at most,
as many terms as that for B(Y). Then

B(X') = Σ_{i=0}^q h_i(X'), q ≤ p

To obtain the complexity equivalence classes for A with
input set X, we cannot simply calculate the X_i corresponding to
the F_i = f_i(X). The reason is that there is no unique complexity
associated with X_j, since f_j contains a call to B which has q+1
distinct functional paths associated with it. To get the true
equivalence classes for A, we will have to substitute the terms of
B(X') into A(X) between g_{k-1} and g_{k+1} in f_j. Assuming no other
function calls in A, this procedure will yield an expanded set of
functions f'_i, each having a unique functional path. Any value
computed in B and returned to A will be available to g_{k+1}. This
expansion yields the following equation for A:

A(X) = Σ_{i=0}^{j-1} f_i(X) + g_n·...·g_{k+1}·(Σ_{i=0}^q h_i)·g_{k-1}·...·g_1(X) + Σ_{i=j+1}^m f_i(X)

     = Σ_{i=0}^{m+q} f'_i(X)

We can now calculate the complexity equivalence classes

X_i = { x ∈ X | f'_i(x) is defined }

and the associated class complexities C(A,X_i).

We are assured of finding a new finite representation
for A(X) because:

1. A(X) (unexpanded) has a finite number of terms.

2. B(Y) has a finite number of terms.

3. A is assumed to be operating correctly so that X' ⊆ Y.
   Therefore, B(X') has a finite representation. Intro-
   ducing B(X') into A(X) yields a new finite representa-
   tion for A(X).

This method is general enough to handle multiple levels
of subroutine calls, multiple calls to the same subroutine from
the same calling program, and recursive subroutine calls. To
illustrate multiple levels of calls, suppose that in the pre-
vious example, B called a program C with input set Y'. We would
find the equation for C(Y'), substitute this into B(X'), and sub-
stitute the expanded B(X') into A(X). We know that the process
of subroutine calls must terminate since A halts for all input
values in X.

For the case of multiple calls from the same program,
assume A called B twice, once with input set X' and once with set X".
We would use our analysis procedure to find B(X') and B(X") and sub-
stitute these into A(X).

Suppose A calls itself as a subroutine. These recur-
sive calls must terminate because of the halting condition. If
the set of values passed from A to itself is X', we know that
X' ⊆ X. After finding A(X'), we substitute it into A(X).

Example of Subroutine Call

We will consider a program A with input set X which
calls a program B with maximal input set Y.

A: [Figure: a conditional node p with a T arc to g and an arc
from g back to p; the F arc leads to the node "CALL B(x)" and
then to a halting node.]

X = { 1,2,3,...,9 }
p(x) = x iff x ≢ 0 mod 3
g(x): x := x+1

B: [Figure: a node h followed by a conditional node q; the T arc
from q returns to h and the F arc leads to a halting node.]

Y = { 3,6,9,12,15,...,3n,... }
h(y): y := y/3
q(y) = y iff y ≡ 0 mod 3

Denoting the node "CALL B(x)" by "b", we have the following
equations:

A(X) = b·p̄(X) + b·p̄·g·p(X) + b·p̄·(g·p)²(X) + ...

     = Σ_{i=0}^∞ f_i(X)

B(Y) = q̄·h(Y) + q̄·h·q·h(Y) + q̄·h·(q·h)²(Y) + ...

     = Σ_{i=0}^∞ h_i(Y)

Using our analysis on just the equation for A(X), we get the
following classes:

X_0 = { 3,6,9 }    X_1 = { 2,5,8 }    X_2 = { 1,4,7 }

A(X) = Σ_{i=0}^2 f_i(X)

However, since the inputs in any X_i undergo different transforma-
tions in B, these are not complexity equivalence classes. We will
have to substitute B(X') into A(X) wherever B is called from A.

Let X'_0 be the set of inputs to B when B is called from
f_0. Then

X'_0 = p̄(X) = { 3,6,9 } ⊆ Y

H_0 = h_0(X'_0) = { 1,2 }    X'_00 = { 3,6 }
H_1 = h_1(X'_0) = { 1 }      X'_01 = { 9 }

Since X'_0 = X'_00 ∪ X'_01,

B(X'_0) = Σ_{i=0}^1 h_i(X'_0) = q̄·h(X'_0) + q̄·h·q·h(X'_0)

Similarly, let X'_1 be the set of inputs to B when it is called
from within the function f_1. Then

X'_1 = p̄·g·p(X) = { 3,6,9 } = X'_0

B(X'_1) = Σ_{i=0}^1 h_i(X'_1)

Finally we let X'_2 be the inputs to B when it is called from f_2.

X'_2 = p̄·(g·p)²(X) = { 3,6,9 } = X'_0

B(X'_2) = Σ_{i=0}^1 h_i(X'_2)

We can now substitute the three equations for B into the appropriate
places in A(X) to get the expanded version of this equation:

A(X) = B(X'_0)·f_0(X) + B(X'_1)·f_1(X) + B(X'_2)·f_2(X)

     = h_0·f_0(X) + h_1·f_0(X) + h_0·f_1(X) + h_1·f_1(X)
       + h_0·f_2(X) + h_1·f_2(X)

     = Σ_{i=0}^5 f'_i(X)

Each of the classes X_i, i = 0,1,...,5, now corresponds to a set of
inputs with a single complexity value; each X_i is a complexity
equivalence class.
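The expanded classes can be checked by simulation: trace each input through A with the call to B inlined, and partition X by the full path taken. The sketch below (an illustration, not part of the original) performs this check for the example.

```python
# Illustrative simulation of the subroutine example: A increments x
# until it is divisible by 3, then calls B, which divides by 3 until
# the result is no longer divisible by 3. Tracing the full path through
# A and the inlined B partitions X = {1,...,9} into the expanded
# complexity equivalence classes.

def run_A(x):
    path = []
    while x % 3 != 0:          # conditional p: defined iff x not divisible by 3
        path.append('p'); x += 1; path.append('g')
    path.append('p-bar')
    # inlined CALL B(x):
    x //= 3; path.append('h')
    while x % 3 == 0:          # conditional q: defined iff y divisible by 3
        path.append('q'); x //= 3; path.append('h')
    path.append('q-bar')
    return tuple(path)

classes = {}
for x in range(1, 10):
    classes.setdefault(run_A(x), set()).add(x)
```

The simulation yields six classes, splitting each of the original three classes according to the number of divisions performed inside B.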

3. LOOP Blocks

We have discussed the LOOP block in a previous chapter and
defined the complexity of this construction as follows:
if the body of the block is denoted by s and all state-
ments within s are a function of x, then the LOOP block

LOOP N
s
END

will have complexity

C(s,x) + C(s,s(x)) + C(s,s²(x)) + ... + C(s,s^{N-1}(x))

We can now include LOOP's into our theory of program equations
and complexity equivalence classes. We will illustrate the
handling of loops by several examples.

Suppose we have the following program A with input x:

LOOP x
h(x)
g(x)
END

If A has a valid input set X, we may represent A in graph format
as follows:

[Figure: the nodes h and g enclosed in a dotted box, with an "x"
just outside the box and an arc leading on to a halting node.]

The dotted lines delineate the scope of the LOOP block. The "x"
just outside the box defines x as the iteration control variable.

The equation for A(X) will have to indicate that g and
h will be executed a different number of times for each value of
x in X. Assuming no attributable complexity to "END", the equation
is

A(X) = Σ_{x∈X} (g·h)^x(x)

If |X| = n, this sum will expand to n terms. If g and h have a
constant associated complexity, there will be n complexity equiva-
lence classes for A with set X. In this case,

X_i = { x_i } and C(A,X_i) = C(A,x_i)
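As a concrete sketch (the bodies h and g and their constant costs c_h and c_g are assumptions for illustration), the complexity of the block can be accumulated by running the body the number of times fixed by the initial value of the control variable.

```python
# Illustrative sketch of LOOP-block complexity: the body s (here h then
# g, with assumed constant costs c_h and c_g) runs N times, where N is
# the initial value of the iteration control variable x. The total is
# C(s,x) + C(s,s(x)) + ... + C(s, s^(N-1)(x)).

c_h, c_g = 2, 3            # assumed constant per-call costs

def loop_complexity(x, h, g):
    n, cost = x, 0         # iteration count fixed before the loop runs
    for _ in range(n):
        cost += c_h; x = h(x)
        cost += c_g; x = g(x)
    return cost

# With constant body cost, each input forms its own singleton class
# X_i = {x_i} with C(A, x_i) = x_i * (c_h + c_g).
costs = {x: loop_complexity(x, lambda v: v + 1, lambda v: 2 * v)
         for x in range(1, 6)}
```

Since the body cost is constant here, the cost grows linearly in the initial value of x, one class per input.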

Let us consider the more complicated program B:

[Figure: program B: a LOOP block on x containing h and g, followed
by a conditional node p; the T arc returns to the LOOP block and
the F arc leads to a halting node.]

Now the LOOP block may be executed many times, the number of
iterations each time (i.e., x) dependent upon previous executions.

To calculate B(X) we use our basic method: find those
inputs for which B halts on the first pass through p, on the
second pass, etc. Sum these terms to get B(X).

If f_0 is the functional path which terminates on the
first pass through p,

f_0(X) = p̄·Σ_{x∈X} (g·h)^x(X)

Similarly, if f_1 denotes the functional path which ends on the second
pass through p,

f_1(X) = p̄·Σ_{x'∈X'} (g·h)^{x'}·p·Σ_{x∈X} (g·h)^x(X)

where X' = p·Σ_{x∈X} (g·h)^x(X)

Continuing in this manner,

B(X) = Σ_{i=0}^∞ f_i(X)

Since X is finite, there exists an n such that for all i > n,
f_i(X) = ∅.

∴ B(X) = Σ_{i=0}^n f_i(X)

While this is a valid equation for B(X), each
X_i = { x ∈ X | f_i(x) is defined } does not correspond to a single
value of complexity. Each f_i will have to be subdivided by
expanding out the summations. The subterms, f'_i, will correspond
to single values of complexity; and the X_i,

X_i = { x ∈ X | f'_i(x) is defined }

will be complexity equivalence classes.

4. Multiple Inputs

We now extend our complexity analysis procedures to
programs with more than one input. We will deal specifically
with the case of two inputs. The generalization to programs with
n inputs is immediate.

Let A be a program with two inputs, x and y. Analogous
to the one-input case, we define the set U_1 as

U_1 = { (x,y) | A halts for non-neg. integer inputs x and y;
       x and y valid }

We then define U_2 as the set of all ordered pairs of non-negative
integer values for the language in which A is written or the machine
on which A is running:

U_2 = { (x,y) | x and y are allowable non-neg. integer values }

Then the maximal valid input set to program A is

Z = U_1 ∩ U_2

As before, Z' ⊆ Z is also a valid input set.

We can then proceed to find the functional paths of the
program graph from the starting node to a terminal node. The
function f_i associated with each path maps ordered pairs (x,y)
to (x',y'). f_i is composed of arithmetic and conditional func-
tions. For any (x,y) ∈ Z, we can calculate a complexity C(A,(x,y))
defined by a path through A.

Using the functional paths through A and noting that Z
is a valid input set, we can obtain the finite equation for A(Z):

A(Z) = Σ_{i=0}^m f_i(Z)

The f_i immediately lead to the complexity equivalence classes

Z_i = { (x,y) ∈ Z | f_i((x,y)) is defined }

and the equivalence class complexities

C(A,Z_i) = C(A,(x,y)), (x,y) ∈ Z_i

Before finding the Z_i, we may have had to expand the f_i to take
care of any function calls or LOOP blocks. We can now calculate
C̄(A,Z), C_max(A,Z), and C_min(A,Z) as before.


If we define the sets X and Y as

X = { x | ∃y such that (x,y) ∈ Z }
Y = { y | ∃x such that (x,y) ∈ Z }

we can easily see that Z ⊆ X×Y. For (x,y) ∈ Z implies that x ∈ X
and y ∈ Y, which imply that (x,y) ∈ X×Y. Note that Z is not necessarily
equal to X×Y. If, for inputs x_0 ∈ X, y_0 ∈ Y, A does not halt, then
(x_0,y_0) ∉ Z.

Inputs vs. Variables

Although A may have n inputs, it may have m variables
(including input variables) internal to it, m > n. One way to handle
the m−n variables which are not inputs is to express them in terms
of transformations on the inputs. Thus if y is not an input
variable, then before it is used, it must have a value assigned to
it. This value is a function of the n-tuple of input values:

y = f((x_1, ..., x_n))

Another way to solve the problem is to view A as having
m inputs. Those variables which are not actual inputs have a
default value assigned to them, say zero. If A is working correctly,
these variables will be assigned new values before they are ever
used, so their original default values do not affect the computation.


5. Different Data Types

So far, we have limited our discussion to programs which
operate on non-negative integers from 0 to intmax - an upper bound
imposed by either the language in which A is written or the machine
on which A is running. It is not conceptually difficult to extend
the complexity analysis procedure to cover other types of input.

In most implementations, negative integers range from
-1 to some lower limit: negmax. If we wish to analyze a program
A with an integer input (positive or negative), then our maximal
valid input set X would still be defined as U_1 ∩ U_2, but now

U_1 = { x | A(x) halts and x a valid integer input }

U_2 = { x | x an integer and negmax ≤ x ≤ intmax }

We would then proceed as before to find the program
equation A(X), the complexity equivalence classes X_i, and the
subset complexities C(A,X_i).

We may also easily handle character input (the character
set on a machine is always finite) and floating point numbers
(always finite in number because of the limitations on the magni-
tude of the exponent and the precision of the fraction). In the
most general case, we would have a program with several inputs of
different data types. The work on multiple inputs would apply here,
generalizing to n-tuples with components of different data types.


6. String Input

We have not yet considered programs which have string
or variable length input sequences. An example of such a program
would be a compiler whose input is a series of characters in the
source language which it is required to translate. There is often
only one input variable in such programs; each time a new character
is read, the value of that character is placed in the input variable.
However, unlike previous programs where current values of variables
could be defined in terms of transformations upon older values,
the new value of the input variable in the case of string input is
not generally definable in terms of previous values. Although we
might like to treat this new value as a separate input, the number
of such new inputs is indeterminate, so we cannot directly apply
our previous methods.

One way out of this problem is to observe that compilers
and similar programs usually read their string of inputs until the
string is exhausted. Therefore, given an input string of n charac-
ters, we know that the program processing this string will read in n
distinct input values. If we view an n character string as being
n inputs to the program, then each length string defines a diff-
erent input situation. Thus, when analyzing the complexity of a
program with string input (where the string is always read in its
entirety), a set of valid inputs would be defined as a subset
of all strings of a fixed length, say n. We would then treat the
program as having n inputs.


Chapter V. Results in the Complexity Theory of Programming
Languages

1. Introduction

We are now ready to investigate some of the proper-
ties of the complexity-theoretic concepts previously intro-
duced. We define some complexity relations between programs,
define the complexity of concatenated programs, and study
equivalence of programs in the light of complexity theory. We
prove some results about these areas.

2. Preliminary Definitions

We first define a special form of the complexity
equivalence classes of a program; this form will be used in
most of the definitions which follow. It is a standard which
will be used in comparing equivalence classes of different pro-
grams.

Definition: Given a program A with input set X,
we define the set X_A as follows:

X_A = { X_i | X_i is a complexity equivalence class
       of A with input set X }

Definition: X_A is said to be in normal form iff

(∀X_i, X_j ∈ X_A) [X_i ≠ ∅ (the empty set) and
                   i ≠ j ⇒ C(A,X_i) ≠ C(A,X_j)]

Normal form implies that all the complexity equivalence
classes X_i contain at least one element of X and the complexity
of any equivalence class, C(A,X_i), is unique. It is this
normal form which allows us to compare the sets of all equiva-
lence classes of two different programs.

If X_A is not in normal form, it may be put in this
form by deleting any X_i = ∅, and, if C(A,X_i) = C(A,X_j), by
replacing X_i and X_j with a new equivalence class X_ij = X_i ∪ X_j
where C(A,X_ij) = C(A,X_i). Note that the deletion of X_i = ∅ or
the creation of X_ij does not change C_max(A,X), or C_min(A,X),
or the expected complexity C̄(A,X). In the case of C̄(A,X), we have

X_i = ∅ ⇒ Pr(X_i) = 0

Also, if we form X_ij = X_i ∪ X_j, then since X_i ∩ X_j = ∅,

Pr(X_ij) = Pr(X_i) + Pr(X_j)

∴ C(A,X_i)·Pr(X_i) + C(A,X_j)·Pr(X_j)
  = C(A,X_ij)·(Pr(X_i) + Pr(X_j))
  = C(A,X_ij)·Pr(X_ij)

If we replace X_i and X_j by X_ij, C̄(A,X) remains the same.
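The normalization procedure is mechanical: delete empty classes, then merge classes of equal complexity. The sketch below (classes and complexity values are illustrative assumptions) also checks that the expected complexity is unchanged by normalization.

```python
from fractions import Fraction

# Illustrative sketch of putting X_A into normal form: delete empty
# classes, then merge classes with equal complexity. The classes and
# complexities below are assumptions for illustration.

def normalize(classes):
    """classes: list of (set X_i, complexity C(A,X_i)). Returns normal
    form: no empty class, each complexity value appearing once."""
    merged = {}
    for cls, c in classes:
        if cls:                                    # delete X_i = empty set
            merged[c] = merged.get(c, set()) | cls # X_ij = X_i U X_j
    return [(cls, c) for c, cls in merged.items()]

raw = [({1, 2}, 5), (set(), 7), ({3}, 5), ({4, 5}, 9)]
normal = normalize(raw)

# Expected complexity is unchanged by normalization (uniform selection):
def expected(classes, n):
    return sum(Fraction(len(cls), n) * c for cls, c in classes)
```

Normalization merges the two complexity-5 classes, drops the empty one, and leaves the expected complexity untouched.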

Next we define a shorthand notation for the complexity
equivalence of single inputs and equivalence classes of inputs.


Definition: Given a program A, the relation =_A is
defined as follows:

For x_1, x_2 valid inputs to A, x_1 =_A x_2 iff
C(A,x_1) = C(A,x_2)

For X_i, X_j equivalence classes of A, X_i =_A X_j iff
C(A,X_i) = C(A,X_j)

It is easy to see that =_A is an equivalence relation (reflexive,
symmetric, transitive) since it is defined in terms of another
equivalence relation (=).

Lemma 1: If X_A is in normal form and x_1, x_2 ∈ X, then
x_1 =_A x_2 iff (∃ a unique X_i ∈ X_A such that x_1, x_2 ∈ X_i)

Proof: (⇐) x_1, x_2 ∈ X_i ⇒ C(A,x_1) = C(A,x_2) ⇒ x_1 =_A x_2

(⇒) x_1 =_A x_2 ⇒ C(A,x_1) = C(A,x_2). Suppose
x_1 ∈ X_i and x_2 ∈ X_j. Then X_i =_A X_j. By normality of X_A,
X_i = X_j. Then x_1, x_2 ∈ X_i. Since the X_i are disjoint, X_i
is unique. QED

The lemma gives us another statement of the normal
form condition - all inputs with the same associated complexity
are in the same equivalence class. This result will be useful
in later sections.


3. Relations Between Programs with Identical Input Sets

We now turn to relations between programs which have
the same valid input set. We will develop a series of equiva-
lence relations between such programs, based on different com-
plexity properties of the programs.

Similarity

The first equivalence relation, which we now define,
is also the weakest. It specifies a relation between programs
which divide the same input set into the same complexity
equivalence classes.

Definition: Given programs A and B, both with input
set X and with X_A and X_B in normal form, A and B are similar
on X iff X_A = X_B.

Similarity between two programs on a given input set
is equivalent to saying that two inputs to one program have the
same complexity if and only if they have the same complexity when
input to the other program. This may be formally stated in the
following theorem:

Theorem 1: A and B are similar on X iff
(∀x_1, x_2 ∈ X) [x_1 =_A x_2 ⟺ x_1 =_B x_2]


Proof: (⇒) Suppose x_1 =_A x_2. Then by lemma 1,
x_1, x_2 ∈ X_i, X_i ∈ X_A, where X_A is in normal form. But by
similarity, X_A = X_B. Then X_i ∈ X_B. Therefore, x_1 =_B x_2.
Likewise, x_1 =_B x_2 ⇒ x_1 =_A x_2.

(⇐) Construct X_i ⊆ X from all x ∈ X such that
x =_A x_i; similarly construct X'_i ⊆ X from all x ∈ X such that
x =_B x_i. Since x =_A x_i iff x =_B x_i, X_i = X'_i. Continuing
for all x_i ∈ X, we see that X_i ∈ X_A iff X_i ∈ X_B. Thus
X_A = X_B. None of the X_i are empty (because =_A and =_B are
reflexive); if C(A,X_i) = C(A,X_j), then by hypothesis
C(B,X_i) = C(B,X_j). Therefore, in normalized form, X_A = X_B.
Thus A and B are similar on X. QED

Although the complexities for the two inputs (x_1 and x_2)
are the same in each program, the magnitude of this complexity
for the two programs is, in general, different. Thus x_1 =_A x_2
and x_1 =_B x_2 do not imply that C(A,x_1) = C(B,x_1).
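Similarity can be tested directly from the definition: compute each program's partition of X into complexity equivalence classes and compare them as sets of sets. In the sketch below, the two per-input cost functions are illustrative assumptions standing in for C(A,·) and C(B,·).

```python
# Illustrative test of similarity: A and B are similar on X iff they
# induce the same partition of X into complexity equivalence classes;
# the class complexities themselves may differ in magnitude.

def partition(X, cost):
    """Group inputs by complexity value; with normal form, inputs of
    equal complexity share one class (lemma 1)."""
    classes = {}
    for x in X:
        classes.setdefault(cost(x), set()).add(x)
    return {frozenset(cls) for cls in classes.values()}

def similar(X, cost_A, cost_B):
    return partition(X, cost_A) == partition(X, cost_B)

X = range(10)
# Same partition, different magnitudes: similar.
ok = similar(X, lambda x: x % 3, lambda x: 10 * (x % 3) + 1)
# Different partitions: not similar.
bad = similar(X, lambda x: x % 3, lambda x: x % 2)
```

Note that `ok` holds even though the two cost functions disagree on every input, illustrating that similarity compares classes, not complexity magnitudes.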

Similarity between two programs is preserved by
taking subsets of the original input set, and by intersection
of different input sets which induce similarity.

Theorem 2: If A and B are similar on X and also on Y,
they are similar on Z ⊆ X and on X ∩ Y.

Proof: (Subset) Let |X_A| = n. Define Z_i = Z ∩ X_i,
∀X_i ∈ X_A. Then

∪_{i=1}^n Z_i = ∪_{i=1}^n (Z ∩ X_i) = Z ∩ (∪_{i=1}^n X_i) = Z ∩ X = Z


Also, for i ≠ j,

Z_i ∩ Z_j = (Z ∩ X_i) ∩ (Z ∩ X_j) = Z ∩ (X_i ∩ X_j) = Z ∩ ∅ = ∅

Eliminating any Z_i = ∅, Z_A = { Z_i } is in normal form. Since
X_A = X_B, Z_A = Z_B. Therefore A and B are similar on Z.

(Intersection) X ∩ Y ⊆ X. Using the first part
of the proof, A and B are similar on X ∩ Y. QED

Absolute Similarity

Certain pairs of programs may be similar on all valid
input sets. We have the following definition:

Definition : A and B are absolutely similar iff
A and B are similar on all valid input sets X.

Lemma 2 : If A and B are similar on the maximal
valid input set X, they are absolutely similar.

Proof: All valid input sets are subsets of the maximal
valid input set. By theorem 2, since A and B are similar on
the maximal input set, they are similar on all valid input sets.
Thus A and B are absolutely similar. QED


Lemma 3: If A and B are absolutely similar, and if
A and B are similar on X and on Y, they are similar on X ∪ Y.

Proof: The union of two valid input sets is also a
valid input set. By hypothesis, A and B are similar on all
valid input sets. QED

Homomorphism

We define the relational operators <_A and >_A for single
inputs and input classes.

Definition: For x_1, x_2 valid inputs to A, x_1 >_A x_2
iff C(A,x_1) > C(A,x_2), and x_1 <_A x_2 iff C(A,x_1) < C(A,x_2).

For X_i, X_j equivalence classes of A, X_i >_A X_j iff
C(A,X_i) > C(A,X_j), and X_i <_A X_j iff C(A,X_i) < C(A,X_j).

Similarity tells us that for B similar to A, x_1 =_A x_2
iff x_1 =_B x_2. It also implies that x_1 ≠_A x_2 iff x_1 ≠_B x_2.
We cannot say that x_1 <_A x_2 implies x_1 <_B x_2, nor can we
establish any other ordering relation on the complexities of
inputs. To do this, we need to define another relation on programs
with the same input set which will tell us something more about
the relative magnitudes of the equivalence class complexities.
This leads to the following definition:


Definition: Suppose A and B are similar on X, and we
order the equivalence class complexities C(A,X_i) and C(B,X'_j)
in strictly increasing order (strictly increasing because X_A
and X_B are in normal form) to form two n-tuples (n = |X_A|):

(C(A,X_1), C(A,X_2), ..., C(A,X_n))
(C(B,X'_1), C(B,X'_2), ..., C(B,X'_n))

Because X_A = X_B we know that:

(∀X_i ∈ X_A) (∃X'_j ∈ X_B) [X_i = X'_j]
(∀X'_j ∈ X_B) (∃X_i ∈ X_A) [X'_j = X_i]

Then A and B are homomorphic on X iff (∀i, 1 ≤ i ≤ n) [X_i = X'_i]

The main property of the homomorphic relation, that
the order of complexity is preserved in homomorphic programs,
is shown in the following theorem:

Theorem 3: A and B are homomorphic on X iff (∀x_1, x_2 ∈ X)
[x_1 relop_A x_2 ⟺ x_1 relop_B x_2]
where relop ∈ {>, =, <} and is the same in both cases.


Proof: (⇒) Let x_1 ∈ X_i, x_2 ∈ X_j. Suppose C(A,X_i)
is in the i-th position of its ordered n-tuple (as above) and
C(A,X_j) is in the j-th position. By homomorphism, C(B,X_i) and
C(B,X_j) are in the i-th and j-th positions of their n-tuple. Then

C(A,X_i) relop C(A,X_j) iff C(B,X_i) relop C(B,X_j)

C(A,x_1) relop C(A,x_2) iff C(B,x_1) relop C(B,x_2)

x_1 relop_A x_2 iff x_1 relop_B x_2

where relop is the same operator in all cases.

(⇐) By hypothesis, (∀x_1, x_2 ∈ X) [x_1 relop_A x_2 iff
x_1 relop_B x_2]. In particular, x_1 =_A x_2 iff x_1 =_B x_2, so by
theorem 1, A and B are similar on X. Now assume that in the ordered
n-tuples for C(A,X_i) and C(B,X'_i), (∃i) [X_i ≠ X'_i]. Let i be the
smallest index for which this is true. Suppose the element
corresponding to C(A,X_i) is C(B,X_j). We then have:

(C(A,X_1), C(A,X_2), ..., C(A,X_i), ...)
(C(B,X_1), C(B,X_2), ..., C(B,X_j), ...)

Since i is the smallest index, X_j could not have occurred
within the first i−1 complexities C(A,X_k), k ≤ i−1. Therefore,
C(A,X_j) > C(A,X_i). Similarly, C(B,X_i) > C(B,X_j). If x_1 ∈ X_i,
x_2 ∈ X_j, x_1 <_A x_2 but x_1 >_B x_2. Contradiction! Then our
assumption that X_i ≠ X'_i was wrong. Thus A and B are homomorphic
on X. QED

The homomorphism relation gives us the information we
need to determine the relative magnitudes of the equivalence
class complexities of two programs. To see if homomorphism
holds, we need only examine the two sets of complexity equiva-
lence classes and the associated complexities. If |X_A| ≪ |X|,
the number of objects we will have to examine is small compared
to the total number of inputs.
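That check can be sketched directly: list each program's classes in increasing order of class complexity and compare the two sequences. The cost functions below are illustrative assumptions standing in for C(A,·) and C(B,·).

```python
# Illustrative test of homomorphism: A and B must be similar on X, and
# listing the (shared) classes in strictly increasing order of class
# complexity must give the same sequence of classes for both programs.

def classes_by_cost(X, cost):
    classes = {}
    for x in X:
        classes.setdefault(cost(x), set()).add(x)
    # normal form: one class per complexity value; order by complexity
    return [frozenset(cls) for c, cls in sorted(classes.items())]

def homomorphic(X, cost_A, cost_B):
    a, b = classes_by_cost(X, cost_A), classes_by_cost(X, cost_B)
    return a == b          # same classes, in the same complexity order

X = range(10)
# Same classes, order of complexity preserved: homomorphic.
order_kept = homomorphic(X, lambda x: x % 3, lambda x: 5 + 2 * (x % 3))
# Same classes, but order of complexity reversed: not homomorphic.
order_flipped = homomorphic(X, lambda x: x % 3, lambda x: -(x % 3))
```

The second pair is still similar (same partition) yet fails homomorphism, since the ordering of class complexities is reversed.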

As with similarity, homomorphism is preserved by
subset and intersection of input sets:

Theorem 4: If A and B are homomorphic on X and on Y,
they are homomorphic on Z ⊆ X and on X ∩ Y.

Proof: (Subset) By theorem 2, A and B are similar on
Z. We can order the equivalence class complexities of A and B
on Z:

(C(A,Z_1), C(A,Z_2), ..., C(A,Z_m))
(C(B,Z'_1), C(B,Z'_2), ..., C(B,Z'_m))

where Z_i = X_i ∩ Z and Z'_i = X'_i ∩ Z as in the proof of theorem 2.
Since X_i = X'_i, then Z_i = Z'_i, ∀Z_i ∈ Z_A (= Z_B). ∴ A and B
are homomorphic on Z.

(Intersection) X ∩ Y ⊆ X. Using the first part
of this proof, A and B are homomorphic on X ∩ Y. QED


Homomorphism and Complexity Limits

In Chapter III^ we Introduced the notion of maximum
and minimum complexity on a set of Inputs. We noted that these
quantities bound the resource usage of a program on a particular
input set. Here, we formally define these two quantities and
also two others.

Definition ; Given a program A with input set X and
X^ in normal form, we can define the following quantities:

X_max(A) = that X_i for which C(A,X_i) = C_max(A,X)

X_min(A) = that X_i for which C(A,X_i) = C_min(A,X)

X_max and X_min are unique by the normal form condition. They tell
us which inputs will consume the greatest amount of the resource
being measured and which inputs will consume the least. These
two quantities are preserved by homomorphism:

Lemma 4: If A and B are homomorphic on X, then

X_max(A) = X_max(B) and X_min(A) = X_min(B)

Proof: Suppose X_max(A) = X_1. Then

(∀X_i ∈ X_A, X_i ≠ X_1)[C(A,X_i) < C(A,X_1)]

Since A and B are homomorphic on X,

(∀X_i ∈ X_B, X_i ≠ X_1)[C(B,X_i) < C(B,X_1)]

∴ X_max(B) = X_1

Similarly, for X_min(A) = X_min(B). QED

We also state, without proof, the following lemma
which relates complexity limits to subsets of input sets:

Lemma 5: Given a program A with input set X, if Z ⊆ X, then
C_max(A,Z) ≤ C_max(A,X) and C_min(A,Z) ≥ C_min(A,X).

Thus C_max and C_min for the maximal valid input set limit the
corresponding quantities for all other valid input sets.

Absolute Homomorphism

Analogous to absolute similarity, we can define two
programs to be absolutely homomorphic if they are homomorphic
on all valid input sets. Results corresponding to lemmas 2
and 3 can be shown for absolute homomorphism.

Isomorphism

We now define one last equivalence relation between
programs with the same input set; this relation is stronger
than homomorphism.

Definition : A and B are isomorphic on X iff they are
similar on X and

(∀X_i ∈ X_A)[C(A,X_i) = C(B,X_i)]

Isomorphism is a special case of homomorphism; the
complexity associated with any equivalence class is the same
for both programs. As we would expect, isomorphism is preserved
by subset and intersection of input sets; however, it is also
preserved by union of input sets.
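A minimal test for isomorphism on a finite input set can be sketched as follows (an illustrative Python aside with hypothetical names, not part of the original development; cost functions stand in for complexity measures):

```python
def complexity_classes(cost, inputs):
    # partition the inputs by the resource amount they consume
    d = {}
    for x in inputs:
        d.setdefault(cost(x), set()).add(x)
    # each class, as a frozenset, paired with its common complexity
    return {frozenset(v): c for c, v in d.items()}

def isomorphic(cost_a, cost_b, inputs):
    # isomorphic on X: the same classes AND the same complexity
    # attached to each class under both programs
    return complexity_classes(cost_a, inputs) == complexity_classes(cost_b, inputs)
```

Two programs with the same partition but scaled complexities would be homomorphic yet not isomorphic under this test.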

Theorem 5: If A and B are isomorphic on X and on Y,
they are also isomorphic on Z ⊆ X, X ∩ Y, and X ∪ Y.

Proof: (Subset) By theorem 4, A and B are homomorphic
on Z.

(∀Z_i ∈ Z_A)[C(A,Z_i) = C(A,X_i) = C(B,X_i) = C(B,Z_i)]

∴ A and B are isomorphic on Z.

(Intersection) Follows from subset proof.

(Union) Let Z = X ∪ Y. First we must show
Z_A = Z_B. Z_A is composed of three subsets:

Z_A = ∪{ X_i | (∀Y_j)(X_i ≠_A Y_j) } ∪ ∪{ Y_j | (∀X_i)(Y_j ≠_A X_i) }
    ∪ ∪{ X_i ∪ Y_j | X_i =_A Y_j } = Z_A1 ∪ Z_A2 ∪ Z_A3

Similarly, Z_B = Z_B1 ∪ Z_B2 ∪ Z_B3. By hypothesis, X_A = X_B, Y_A = Y_B, and

(∀X_i, Y_j)[C(A,X_i) = C(B,X_i), C(A,Y_j) = C(B,Y_j)]

∴ Z_A1 = Z_B1, Z_A2 = Z_B2

Since X_i =_A Y_j => X_i =_B Y_j, Z_A3 = Z_B3. Then

C(A,Z_i) = { C(A,X_i), Z_i = X_i
           { C(A,Y_j), Z_i = Y_j

C(B,Z_i) = { C(B,X_i), Z_i = X_i
           { C(B,Y_j), Z_i = Y_j

(Note: if Z_i = X_i ∪ Y_j, C(A,X_i) = C(A,Y_j) and C(B,X_i) = C(B,Y_j))

But by hypothesis, C(A,X_i) = C(B,X_i) and C(A,Y_j) = C(B,Y_j).
Therefore, C(A,Z_i) = C(B,Z_i), ∀Z_i ∈ Z_A. Thus A and B are isomorphic
on X ∪ Y. QED

Isomorphism not only preserves X_max and X_min, but also
C_max, C_min, and C(A,X); we state the following easy lemma without
proof.

Lemma 6: If A and B are isomorphic on X, then
C(A,X) = C(B,X) if the methods used in selecting the
inputs to A and B (i.e., Pr(X_i)) are the same.

In the next section, we discuss concatenation of programs.

4. Concatenation

We will now investigate the complexity properties of
programs which may be combined or concatenated to form larger
programs. By the concatenation of two programs A and B, we
mean the program which is formed by appending B to the end of A,


so that when the locus of execution reaches the end of A, it
will enter B. We will assume that A places the outputs of its
computation into certain registers and that these same registers
are used by B as inputs. B is not required to use all of A's
outputs as inputs; however, B cannot need more inputs than A
can supply. We will say that A is I/O-compatible to B if this
restriction is obeyed.

We will also assume that neither A nor B is changed
by the concatenation of B onto A (which will be denoted by
"A•B"). We run into at least one trouble spot: if A finishes
execution by halting somewhere in the middle. To continue with
the execution of B, we would need to change this HALT instruction
into a BRANCH to the end of A. This modification would probably
change A's complexity with respect to certain inputs. To avoid
this difficulty, we will assume that all programs terminate by
executing their last instruction. Thus we may perform conca-
tenation without changing instructions in either program.

If we wish to study the complexity of A•B for an input
set X, the inputs to B will have to be valid halting inputs.
Therefore, A(X), the set of inputs to B, will always be a valid
input set. Given this fact, the complexity of A•B for any input
x can be determined by linearity:

C(A•B,x) = C(A,x) + C(B,A(x))
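The linearity equation can be illustrated with a small Python sketch (an illustrative aside, not the thesis's formalism; the Program record with its run and cost functions is a hypothetical stand-in for a program and its complexity measure):

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class Program:
    run: Callable    # the transformation the program applies to its input
    cost: Callable   # resource units consumed on a given input

def concat(a, b):
    # A•B: B reads the registers that A wrote; by linearity the costs add
    return Program(
        run=lambda x: b.run(a.run(x)),
        cost=lambda x: a.cost(x) + b.cost(a.run(x)),  # C(A•B,x) = C(A,x) + C(B,A(x))
    )
```

For example, if A doubles its input at a cost of x units and B adds one at a unit cost, then concat(A, B) computes 2x + 1 at a cost of x + 1.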


Compatibility

We now place another restriction on the concatenation
of programs which will enable us to prove some properties on the
complexity of concatenation. This restriction will be defined
in two stages as follows:

Definition: Let A be I/O-compatible to B, and for X
a valid input set to A, let A(X) = Y be a valid input set to B.
Then A is X_i-compatible to B iff for X_i ∈ X_A (in normal form),
there exists Y_j ∈ Y_B (in normal form) such that A(X_i) = Y_i ⊆ Y_j.

X_i-compatibility tells us that a set of inputs (X_i)
which, by definition, have the same complexity for A (C(A,X_i))
will be transformed into a set of inputs (Y_i) for B which will
all have the identical complexity, C(B,Y_j). We now extend
this relation to all equivalence classes of X:

Definition: A is X-compatible to B iff A is X_i-compatible
to B for all X_i ∈ X_A (in normal form).
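On a finite input set, X-compatibility can be checked directly (again an illustrative Python aside with hypothetical names; run_a is A's transformation and the cost functions stand in for the complexity measures of A and B):

```python
def x_compatible(run_a, cost_a, cost_b, inputs):
    # A is X-compatible to B iff each complexity class X_i of A is
    # carried by A into a single complexity class of B, i.e. every
    # image A(x), x in X_i, has one and the same complexity under B
    classes_a = {}
    for x in inputs:
        classes_a.setdefault(cost_a(x), []).append(x)
    return all(len({cost_b(run_a(x)) for x in xs}) == 1
               for xs in classes_a.values())
```

The check fails exactly when some class of A is split by B, which is the situation the definition rules out.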

If A is X-compatible to B, it is easy to see that for
x_1, x_2 ∈ X,

x_1 =_A x_2 => x_1 =_A•B x_2

However, the converse is not necessarily true. If x_1 ≠_A x_2 and
A(x_1) ≠_B A(x_2), it may still be that x_1 =_A•B x_2. By theorem 1,

-70-

we conclude that A and A»B are not necessarily similar on X.

Compatibility is preserved by several operations on
input sets. We state the following two theorems without including
the proofs.

Theorem 6: If A is X-compatible to B and also Y-
compatible to B, and if Z ⊆ X, A is Z-compatible and X ∩ Y-com-
patible to B.

Theorem 7: If A is X-compatible to B and also Z-
compatible to B, and if for A(X_i) ⊆ Y_i and A(Z_i) ⊆ T_i,
X_i =_A Z_i => Y_i =_B T_i, then A is X ∪ Z-compatible to B.

In the case of theorem 7, we need the additional
restriction that (X_i =_A Z_i => Y_i =_B T_i) because if two equivalence
classes (X_i, Z_i) must be combined in the normal form of (X ∪ Z)
as a result of having the same complexity (X_i =_A Z_i), the images
(Y_i, T_i) of these classes under A must also have the same
complexity (Y_i =_B T_i) so that A(X_i ∪ Z_i) ⊆ Y_i ∪ T_i.

Compatibility and Isomorphism

To conclude the section on concatenation, we discuss
some relationships between compatibility and isomorphism.
We would like to see under what conditions isomorphic programs
can be concatenated to yield programs which are still isomorphic.
We first show that isomorphism is preserved by the concatenation
of the isomorphic programs to a program which is compatible to
both:

Theorem 8: If A is Y-compatible to B and to C, and
if B and C are isomorphic on X = A(Y), then A•B and A•C are
isomorphic on Y.

Proof: Let D = A•B, E = A•C. We need to show that
Y_D = Y_E and (∀Y_i ∈ Y_D)[C(D,Y_i) = C(E,Y_i)]. Y_D is composed of two
subsets:

Y_D = ∪{ Y_i | (∀j≠i)(Y_i ≠_D Y_j) } ∪ ∪{ Y_i ∪ Y_j | Y_i =_D Y_j } = Y_D1 ∪ Y_D2

Similarly, Y_E = Y_E1 ∪ Y_E2.

Since C(B,X_i) = C(C,X_i), ∀X_i ∈ X_B, it must be that
Y_D1 = Y_E1 and Y_D2 = Y_E2. Therefore, Y_D = Y_E.

Then (∀Y_i ∈ Y_D),

C(D,Y_i) = (by linearity)
C(A,Y_i) + C(B,A(Y_i)) = (by isomorphism)
C(A,Y_i) + C(C,A(Y_i)) = (by linearity)
C(E,Y_i)

∴ A•B and A•C are isomorphic on Y. QED


Since A•B and A•C are isomorphic, we can continue the
concatenation if we can find a program D which is Z-compatible
to A•B and A•C and where D(Z) ⊆ Y. Then D•A•B and D•A•C will
be isomorphic on Z.

We now show the conditions under which the concatena-
tion of pairs of isomorphic programs results in programs which
are also isomorphic.

Theorem 9: Suppose A and B are isomorphic on X,
A is X-compatible to C, B is X-compatible to D, and C and D are
isomorphic on Y = A(X) ∪ B(X). Then A•C and B•D are isomorphic
on X iff

(∀X_i ∈ X_A)(∃Y_j ∈ Y_C)[A(X_i), B(X_i) ⊆ Y_j]

Proof: (<=) Let E = A•C, F = B•D. Then

X_E = ∪{ X_i | (∀j≠i)(X_i ≠_E X_j) } ∪ ∪{ X_i ∪ X_j | X_i =_E X_j } = X_E1 ∪ X_E2

Similarly, X_F = X_F1 ∪ X_F2.

Since A(X_i), B(X_i) ⊆ Y_j and C and D are isomorphic
on Y, C(C,A(X_i)) = C(C,Y_j) = C(D,Y_j) = C(D,B(X_i)). Also,
C(A,X_i) = C(B,X_i), ∀X_i ∈ X_A. Thus, C(E,X_i) = C(F,X_i), ∀X_i.


∴ X_E1 = X_F1, X_E2 = X_F2, and thus X_E = X_F.

Then (∀X_i ∈ X_E),

C(E,X_i) = (by linearity)
C(A,X_i) + C(C,A(X_i)) = (by isomorphism)
C(B,X_i) + C(C,A(X_i)) = (by hypothesis)
C(B,X_i) + C(C,Y_j) = (by isomorphism)
C(B,X_i) + C(D,Y_j) = (by hypothesis)
C(B,X_i) + C(D,B(X_i)) = (by linearity)
C(F,X_i)

∴ A•C and B•D are isomorphic on X.

(=>) Given A(X_i) ⊆ Y_j, B(X_i) ⊆ Y_k, we want
to show that Y_j = Y_k. For X_i ∈ X_A, C(E,X_i) = C(F,X_i). By
linearity, C(A,X_i) + C(C,A(X_i)) = C(B,X_i) + C(D,B(X_i)). But
C(A,X_i) = C(B,X_i), so we must have that C(C,A(X_i)) =
C(D,B(X_i)); or C(C,Y_j) = C(D,Y_k). But since C and D are
isomorphic on Y, Y_j and Y_k must be the same class (in normal
form). Therefore, Y_j = Y_k. QED

5. Functional Equivalence

It is often the case that we wish to examine two
programs which represent different algorithms for the same
function. The programs compute the same output when given the
same input from a specified input set; however, the method


(algorithm) used to compute the result is not the same. This
situation often arises when we wish to determine which version
of a particular subroutine or program we should use. All
versions represent (hopefully!) the same function, but one
version may use fewer resources than another. Program equivalence
may also arise in the area of simulation. If one program is
simulating another, and if the first program is also producing
the same output as the second, then the programs are equivalent.

Equivalence is defined here in terms of similar func-
tional behavior on a specified input set. Output values are
defined as transformations on input values. If output variables
are not also inputs, their values must be expressible in terms
of the inputs.

Definition: A and B are equivalent on X iff

(∀x ∈ X)[A(x) = B(x)]
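On a finite input set, this definition can be tested directly. As an illustrative Python aside (the two summation routines are our own example of different algorithms for the same function, not drawn from the thesis):

```python
def sum_loop(n):
    # O(n)-time version: accumulate 0 + 1 + ... + n
    total = 0
    for i in range(n + 1):
        total += i
    return total

def sum_formula(n):
    # constant-time version of the same function
    return n * (n + 1) // 2

def equivalent(run_a, run_b, inputs):
    # A and B are equivalent on X iff A(x) = B(x) for every x in X
    return all(run_a(x) == run_b(x) for x in inputs)
```

The two routines are equivalent on any set of non-negative integers, yet their complexities differ, which is precisely the situation motivating this section.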

It is easy to see that equivalence is closed under
subset, union, and intersection: if A and B are equivalent
on X and on Y, they are equivalent on Z ⊆ X, X ∩ Y, and X ∪ Y.
Equivalence and concatenation are related by the following lemma:

Lemma 7: If A and B are equivalent on X, C and D are
equivalent on Y where A(X) ⊆ Y, and if A, B are I/O-compatible
to C, D, then A•C, A•D, B•C, B•D are equivalent on X.


Proof: By the I/O-compatibility restriction, A•C, etc.,
may be correctly formed. Now for x ∈ X, A(x) = B(x) = y. Since
y ∈ Y, C(y) = D(y). Therefore, A•C(x) = B•D(x), and A•C and B•D
are equivalent on X. Similarly for the other cases. QED

To conclude this chapter, we present the following
theorem which relates isomorphism, compatibility, and equivalence:

Theorem 10: Suppose A and B are isomorphic and
equivalent on X, A is X-compatible to C, B is X-compatible to D,
and C and D are isomorphic on Y = A(X) = B(X) (since A and B are
equivalent). Then A•C and B•D are isomorphic on X.

Proof: We look at theorem 9. Since A(x) = B(x),
∀x ∈ X, A(X_i) = B(X_i), ∀X_i ∈ X_A. Since A is X-compatible to
C, A(X_i) ⊆ Y_j, Y_j ∈ Y_C. But by the isomorphism of C and D on
Y, Y_j ∈ Y_D. Then B(X_i) ⊆ Y_j. Therefore, A•C and B•D are isomor-
phic on X. QED

If C and D are equivalent on Y in addition to being
isomorphic, we can use lemma 7 to conclude that A•C and B•D
are isomorphic and equivalent on X.

Chapter VI. Conclusions and Suggestions for Further Study

Conclusions

We have developed a general theory of computational
complexity for computer programs. We have looked at complexity
from the viewpoint of resource usage and regarded the use of
different resources as different measures of the complexity of
a program. Many complexity measures fit into our theory, the
only requirement being that the usage of the associated
resource obey the linearity principle.

Our theory has been based upon observing the behavior
of a program on a valid set of inputs. We have also relied
extensively on an equation representation of the transformations
which a program applies to its inputs. We have attempted to
justify the use of finite input sets by noting that real pro-
grams, running on real computers, are able to accept and
manipulate only a finite number of distinct input values, due
to software and hardware limitations. In addition, most pro-
grams actually process only a subset of all possible input
values. Our methods were shown to be valid for programs with
a large number of programming linguistic constructs and with
several different input schemes.

The complexity analysis of a program produced a set
of complexity equivalence classes and a corresponding set of
class complexities. We have noted that most programs do not
treat every input value differently, and therefore the number
of equivalence classes will generally be smaller than the number
of possible inputs. Thus, these two sets give a relatively
compact representation of the complexity information for a program
operating on a given input set. These sets immediately led to
some complexity parameters for the program:

1. C(A,X) - the expected complexity value on the set X

2. C_max(A,X) - the maximum complexity on the set X

3. X_max(A) - those inputs which result in complexity C_max(A,X)

4. C_min(A,X) - the minimum complexity on the set X

5. X_min(A) - those inputs which result in complexity C_min(A,X)

C_max(A,X) and C_min(A,X) bound the resource usage of program A
and tell how much of this resource A can possibly use and how
much it must use to process inputs from set X.
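The five parameters above can all be read off the complexity equivalence classes. A minimal Python sketch (illustrative only; cost stands in for the complexity measure and prob for the input-selection probabilities Pr(X_i)):

```python
def complexity_parameters(cost, prob, inputs):
    # build the complexity equivalence classes, then read off the
    # five parameters listed above
    classes = {}
    for x in inputs:
        classes.setdefault(cost(x), set()).add(x)
    c_max, c_min = max(classes), min(classes)
    return {
        "expected": sum(cost(x) * prob(x) for x in inputs),  # C(A,X)
        "C_max": c_max, "X_max": classes[c_max],
        "C_min": c_min, "X_min": classes[c_min],
    }
```

For instance, with inputs 0..3, cost(x) = x, and a uniform selection probability of 1/4, the expected complexity is 1.5, with X_max = {3} and X_min = {0}.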

Our basic purpose has been to develop some theoretical
tools for studying the difficulty of computing functions by
observing the program implementations of these functions.
While these techniques will work for any program of the types
described, the necessary computations will become unmanageable
if the program equation is complex or if the number of complexity
equivalence classes is large. Thus, one could not expect to
sit down and find the equation and equivalence classes of a
PL/I compiler, just as one would be hard-pressed to prove that
such a compiler is "correct". Our techniques will probably
be most valuable in the analysis of small programs and also

in deciding which version of a particular program will be most
suitable for use; suitable in the sense of using the least
resources over a particular set of inputs, where we may wish
to minimize C(A,X), C_max(A,X), or C_min(A,X).

Programmers and systems analysts are often faced
with this latter problem, particularly in a large programming
system such as a language translator or operating system which
is modularized and which has its basic components frequently
replaced as the system is upgraded. The complexity analysis
techniques allow different versions of program modules to be
compared, perhaps in one case on the basis of maximum resource
usage, in another with regard to average resource usage, in a
third with respect to some combination of complexity parameters.

Areas for Further Study

We have used the concept of a program equation for
much of our work. This equation is independent of a study
of complexity. It gives a concise algebraic formulation of
a program or algorithm. It is also valid on an infinite input
set as long as all elements in this set are halting inputs for
the program in question. The equation brings to light the
transformational characteristics of a program. It should be
useful in the study of other aspects of programs and programming
languages, such as program semantics and program correctness.


A subject which we have not examined but which is
important is the use of space as a complexity measure. The
use of space does not follow the linearity principle. An
analogous constraint for space and its related measures would
have to be devised to analyze the complexity of programs with
respect to these resources.

We have not discussed the effects of transformations
of the program on the complexity. For example, suppose we make
a well-defined modification to program A with input set X.
Is there a well-defined effect on X_A, C_max(A,X), C_min(A,X)?
Cooper's work on graph transformations [8] may be useful here.

We have defined three equivalence relations between
programs with the same input set. There may be other relations,
intermediate in strength between similarity and isomorphism,
which reflect other programming situations.

We have mentioned that it is advantageous for
|X| » |X_A|. In this case, X_A and { C(A,X_i) | X_i ∈ X_A } give
us a more economical representation of the complexity informa-
tion for A than if we simply looked at all x in X. However, we
have not explored the relationship between the relative size of
X_A and the nature of A itself. Are there certain conditions for
which we get this economical representation?

Appendix - Mathematical Notation

Sets

We make use of the notion of a set and various opera-
tions upon it. A set is an unordered collection of objects
and is named by a capital letter. The elements of a set are
enclosed in braces and separated by commas.

A subset of a given set is another set containing all,
some, or none of the elements of the original set. If Z is
a subset of X, we write Z ⊆ X. If a subset contains no elements,
it is called the empty set and denoted by ∅.

The size of a set X is written |X| and is simply the
number of elements in X.

The union of the sets X and Y, denoted X ∪ Y, is a
set containing those elements which are in either X or Y or
both.

The intersection of sets X and Y, denoted X ∩ Y, is
a set containing those elements which are in both X and Y.

We denote membership in a set by the symbol ∈. Thus,
a ∈ { a,b }. Similarly, c ∉ { a,b }.

A set may be specified by describing the conditions
for membership in the set rather than listing all of the members.
We use the notation

X = { x | "conditions" }

This may be read "X is the set of all objects x such that the
conditions are true". Thus the set

Y = { x | x ∈ X_1 and x ∈ X_2 }

describes Y as the intersection of X_1 and X_2.

Ordered Pairs, Cross Products

An ordered pair, denoted (x,y), is an object with
two components. The order of the components is significant:
(x,y) is distinct from (y,x) unless x = y.

A set of ordered pairs can be formed from sets of
single objects by the cross product operator. The cross pro-
duct of the sets X and Y, denoted X×Y, is defined as follows:

X×Y = { (x,y) | x ∈ X and y ∈ Y }

Quantifiers

We use two logical quantifiers in our notation. The
universal quantifier, denoted ∀, may be read "for all". It
is used to qualify the statement following it. Thus,

(∀x ∈ X)[x ≥ y]

means "for all x in X, x ≥ y".

The existential quantifier, ∃, may be read "there
exists". The statement

(∃x ∈ X)[x ≥ y]

means "there exists an x in X such that x ≥ y".

Quantifiers may be grouped in a series to form more
complex logical statements. We might have

(∀x ∈ X)(∃y ∈ Y)[x ≥ y]

which states that "for all x in X, there exists a y in Y such
that x ≥ y".

Functional Composition

The composition of functions f and g, denoted f•g, is
another function which transforms the domain of g into the
range of f. Thus if g(x) = y and f(y) = z, f•g(x) = z. Composi-
tion may be continued for any number of functions. If a function
f is composed with itself i times, we may abbreviate this as f^i.
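These two conventions can be expressed directly in Python (an illustrative aside with hypothetical function names, not part of the original notation):

```python
from functools import reduce

def compose(f, g):
    # f•g: apply g first, then f
    return lambda x: f(g(x))

def iterate(f, i):
    # f composed with itself i times, written f^i in the text;
    # i = 0 yields the identity function
    return reduce(compose, [f] * i, lambda x: x)
```

For example, composing "add one" after "double" sends 3 to 7, and iterating "double" three times sends 1 to 8.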

Implication and Equivalence

If the truth of statement A implies the truth of
statement B, we write

A => B

This may also be read "if A then B". Similarly, if B implies
A, we write

B => A

If A and B imply each other, they are said to be
equivalent and we write

A <=> B

We can also write equivalence as

A iff B

where "iff" is an abbreviation for "if and only if".


References

1. Hartmanis, J., and J. E. Hopcroft, "An Overview of the
Theory of Computational Complexity", Technical
Report No. 70-59, Department of Computer Science,
Cornell University. (See JACM
18(3), July 1971, for a later version of this report.)

2. Landin, P. J., "The Mechanical Evaluation of Expressions",

Computer Journal 6, 4 (January 1964) pp. 308-320.

3. Blum, M. , "A Machine Independent Theory of the Complexity

of Recursive Functions", JACM 14 (1967) pp. 322-336.

4. Meyer, A. R., and D. M. Ritchie, "The Complexity of LOOP
Programs", Proceedings of the 22nd National ACM Conference,
(1967), pp. 465-469.

5. Ramamoorthy, C. V., "Discrete Markov Analysis of Computer
Programs", Proceedings of the 20th National ACM Conference,
(1965), pp. 386-392.

6. Zeiger, H. P., "Formal Models of Some Features of Programming
Languages", Proceedings of the 3rd Annual Princeton
Conference on Information Science and Systems, (1969),
pp. 425-429.


7. Drake, A., Fundamentals of Applied Probability Theory,
McGraw-Hill Book Company (1967), New York.

8. Cooper, D. C., "Some Transformations and Standard Forms of
Graphs, with Applications to Computer Programs",
in Machine Intelligence 2, Dale and Michie (ed.),
American Elsevier Publishing Company, Inc. (1968),
New York.

UNCLASSIFIED

Security Classification

DOCUMENT CONTROL DATA - R&D

(Security classification of title, body of abstract and indexing annotation must be entered when the overall report is classified)

1. ORIGINATING ACTIVITY (Corporate author)

Massachusetts Institute of Technology
Project MAC

2a. REPORT SECURITY CLASSIFICATION

UNCLASSIFIED

2b. GROUP

None

3. REPORT TITLE

Complexity Measures for Programming Languages

4. DESCRIPTIVE NOTES (Type of report and inclusive dates)

Technical Memorandum

5. AUTHOR(S) (Last name, first name, initial)

Goodman, Leonard I.

6. REPORT DATE

September 1971

7a. TOTAL NO. OF PAGES

86

7b. NO. OF REFS

8a. CONTRACT OR GRANT NO.

Nonr-4102(01)

b. PROJECT NO.

9a. ORIGINATOR'S REPORT NUMBER(S)

TM-17

9b. OTHER REPORT NO(S) (Any other numbers that may be
assigned this report)

10. AVAILABILITY/LIMITATION NOTICES

Distribution of this document is unlimited.

11. SUPPLEMENTARY NOTES

None

12. SPONSORING MILITARY ACTIVITY

Advanced Research Projects Agency
3D-200 Pentagon
Washington, D.C. 20301

13. ABSTRACT

A theory of complexity is developed for algorithms implemented in typical programming
languages. The complexity of a program may be interpreted in many ways; a method for
measuring a specific type of complexity is a complexity measure -- some function of the
amount of a particular resource used by a program in processing an input.

After the complexity of the basic program elements is determined, program complexity is
analyzed with respect to single inputs and then with respect to finite sets of legiti-
mate halting inputs. A program equation is developed to aid in the complexity analysis.
Using this equation, an input set is partitioned into classes of constant complexity.

Several equivalence relations are defined, relating different programs by their complex-
ity. Complexity is also discussed in terms of concatenation and functional equivalence
of programs.

14. KEY WORDS

Computational Complexity
Program Resource Usage

Complexity Measures
Programming Languages

Program Equations

Program Equivalence Relations

DD FORM 1473 (M.I.T.)

Security Classification